Long Repetitive Patterns in Random Sequences


L.J. Guibas¹ and A.M. Odlyzko²


¹ Xerox Palo Alto Research Center, Palo Alto, California 94304, USA
² Bell Laboratories, Murray Hill, New Jersey 07974, USA


Summary. Appearances of long repetitive sequences such as 00...0 or 
1010...101 in random sequences are studied. The expected length of the 
longest repetitive run of any specified type in a random binary sequence of 
length n is shown to tend to the binary logarithm of n plus a periodic 
function of log n. Necessary and sufficient conditions are derived to ensure
that with probability 1 an infinite random sequence should contain repetitive 
runs of specified lengths in given initial segments. Finally, the number of 
long repetitive runs of a specified kind that occur in a random sequence is 
studied. These results are derived from simple expressions for the generating 
functions for the probabilities of occurrences of various repetitive runs. These 
generating functions are rational, and lead to sharp asymptotic estimates for 
the probabilities.


1. Introduction 


Runs of heads in sequences of coin tosses have been under intensive study for a 
long time (see [4; Chap. 13]). Some recent results in this subject are given in [2, 
3, 8, 9, 11]. (A good survey of many of these problems is presented in [11].) The 
purpose of this paper is to extend and improve, by the use of different methods, 
some of the results of Erdős and Révész [3] on the length of the longest
head-run. Let $X_1, X_2, \ldots$ be a sequence of independent and identically distributed
random variables with $\Pr(X_i = 0) = \Pr(X_i = 1) = 1/2$, and let $\Omega$ denote the basic
space. (The extension of our results to the case of unequal probabilities and
more than two outcomes is relatively straightforward.) Let $Z_n$ denote the length
of the longest head-run among the first n coin tosses; i.e., $Z_n$ is the largest integer
Z such that there is a k, $1 \le k \le n - Z + 1$, with $X_k = X_{k+1} = \cdots = X_{k+Z-1} = 1$.
Erdős and Révész [3] were interested in bounds for $Z_n$ that would hold for
almost all $\omega \in \Omega$. In some cases they obtained best possible results.


Theorem A. Let $\{h_n\}$ be a sequence of positive integers. If

\[ \sum_{n=1}^{\infty} 2^{-h_n} = \infty, \]


0044-3719/80/0053/0241/$04.40 
then for almost all $\omega \in \Omega$ there is an infinite sequence $n_i = n_i(\omega, \{h_n\})$, $i = 1, 2, \ldots$, of
integers such that

\[ Z_{n_i} \ge h_{n_i}. \]

If, on the other hand,

\[ \sum_{n=1}^{\infty} 2^{-h_n} < \infty, \]

then for almost all $\omega \in \Omega$ there exists a positive integer $n_0 = n_0(\omega, \{h_n\})$ such that

\[ Z_n < h_n \quad \text{for } n \ge n_0. \]


Erdős and Révész were also interested in lower bounds for $Z_n$ which are
satisfied for all n with probability 1. Since there is a positive probability that $Z_n$
is less than any preassigned number, in order to obtain a result that holds with
probability 1, one can only ask for bounds that hold for sufficiently large n. In
this case the Erdős-Révész result was not quite as definitive as Theorem A.


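Theorem B. Let $\varepsilon > 0$ be given. Then for almost all $\omega \in \Omega$ there exists a finite
$n_0 = n_0(\omega, \varepsilon)$ such that for $n \ge n_0$,

\[ Z_n \ge [\lg n - \lg\lg\lg n + \lg\lg e - 2 - \varepsilon], \tag{1.1} \]

where $\lg x = \log_2 x$ denotes logarithm to the base 2. Also, for almost all $\omega \in \Omega$ there
exists an infinite sequence $n_i = n_i(\omega, \varepsilon)$, $i = 1, 2, \ldots$, of positive integers such that

\[ Z_{n_i} < [\lg n_i - \lg\lg\lg n_i + \lg\lg e - 1 + \varepsilon]. \tag{1.2} \]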


The gap between the upper bound (1.2) and the lower bound (1.1) is
essentially 1. One of the purposes of this paper is to close this gap by providing
a complete characterization (Theorem 2 below, but the notation there is
different) for sequences $\alpha(n)$ such that $Z_n \ge \alpha(n)$ holds for large n with probability 1.
It turns out that the right bound for Theorem B is close to (1.2), and that the
critical value for $Z_n$ is

\[ \lg n - \lg\lg\lg n + \lg\lg e - 1 + O\!\left(\frac{\lg\lg\lg n}{\lg\lg n}\right). \]

Before stating our precise results, we introduce some more notation. So far,
we have mentioned only head-runs. Our results, however, extend also to other
repetitive patterns in the coin tossing sequence, such as sequences of alternating
heads and tails. Let $B = b_1 b_2 \ldots b_m$ be a pattern (finite sequence of 0's and 1's or
heads and tails) that is nonperiodic, i.e., such that B cannot be written as
B = CC...C for any pattern C that is shorter than B. (As an example, 010 is a
nonperiodic pattern, while 0101 is not.) A B-run of length k is a sequence
A = BB...BB' of length k, where B' denotes a prefix of B. (Thus if B = 010, then
01001 is a B-run of length 5.) We define $Z_n(B)$ to be the length of the longest
B-run in the first n coin tosses. We will prove best possible results about $Z_n(B)$ for
general nonperiodic B.
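
As an illustration of the definition, the following minimal sketch computes the length of the longest B-run in a finite sequence by matching each starting position against the periodic word BBB.... The function name `longest_b_run` and the string encoding of tosses are assumptions made for this example only; the quadratic scan is chosen for clarity, not efficiency.

```python
def longest_b_run(seq: str, pattern: str) -> int:
    """Length of the longest B-run BB...BB' occurring in seq.

    A B-run of length k is the first k characters of the infinite
    periodic word pattern + pattern + ..., so for each start position
    we extend the match as far as it goes.
    """
    m = len(pattern)
    best = 0
    for start in range(len(seq)):
        length = 0
        while (start + length < len(seq)
               and seq[start + length] == pattern[length % m]):
            length += 1
        best = max(best, length)
    return best


if __name__ == "__main__":
    # 01001 is a B-run of length 5 for B = 010.
    print(longest_b_run("1101001", "010"))
    # Ordinary head-runs correspond to B = "1".
    print(longest_b_run("0111011", "1"))
```

With B = "1" the function reduces to the longest head-run $Z_n$ of the first part of the Introduction.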

While results about B-runs are of interest in themselves, they are not
completely satisfactory since they do not correspond too well to one's intuition
about what constitutes a long repetitive run. For example, if B = HT, then
HTHTH is a B-run of length 5, while THTHT is not. We therefore introduce the
more felicitous concept of a B*-run. A B*-run of length k, for a nonperiodic B, is
a pattern A = B''BB...BB' of length k, where B' is a prefix of B and B'' is a suffix
of B. (Thus THTHT is a B*-run of length 5 if B is any of the following: TH, HT,
THTH, HTHT, or THTHT.) $Z_n^*(B)$ is then the length of the longest B*-run in the
first n coin tosses. The analysis of $Z_n^*(B)$ turns out to be very much more
complicated than that of $Z_n(B)$, but the final results turn out to be rather
interesting. This concept of a B*-run enables us also to deal with maximal runs
of either pure heads or pure tails. If $\omega$ is a sequence of heads or tails in which
the longest run of either pure heads or pure tails in the first n elements is k, and
$\omega'$ is the sequence obtained from $\omega$ by changing the elements in the
even-numbered positions from heads to tails and vice versa, then head or tail runs in
$\omega$ correspond to B*-runs in $\omega'$, where B = HT, and $Z_n^*(B) = k$.
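
The correspondence between pure runs and B*-runs for B = HT can be checked by brute force on short sequences. The sketch below is illustrative only: all function names are hypothetical, and the longest B*-run is computed as the maximum of the longest runs over all cyclic rotations of B, which is an assumption that matches the definition for runs of length at least |B| (and for all lengths when B = HT).

```python
from itertools import product


def longest_b_run(seq: str, pattern: str) -> int:
    """Longest prefix of pattern+pattern+... occurring somewhere in seq."""
    m = len(pattern)
    best = 0
    for start in range(len(seq)):
        length = 0
        while (start + length < len(seq)
               and seq[start + length] == pattern[length % m]):
            length += 1
        best = max(best, length)
    return best


def longest_b_star_run(seq: str, pattern: str) -> int:
    """Longest B*-run, taken as the best B_r-run over all rotations B_r of B."""
    rotations = [pattern[r:] + pattern[:r] for r in range(len(pattern))]
    return max(longest_b_run(seq, rot) for rot in rotations)


def longest_pure_run(seq: str) -> int:
    """Longest run of identical symbols (pure heads or pure tails)."""
    best = current = 0
    previous = None
    for symbol in seq:
        current = current + 1 if symbol == previous else 1
        previous = symbol
        best = max(best, current)
    return best


def flip_even_positions(seq: str) -> str:
    """Swap H and T in positions 2, 4, 6, ... (1-based numbering)."""
    swap = {"H": "T", "T": "H"}
    return "".join(swap[c] if i % 2 == 1 else c for i, c in enumerate(seq))


def check_flip_correspondence(max_len: int) -> bool:
    """Verify the claim for every H/T sequence of length 1..max_len."""
    for n in range(1, max_len + 1):
        for letters in product("HT", repeat=n):
            omega = "".join(letters)
            if longest_pure_run(omega) != longest_b_star_run(
                    flip_even_positions(omega), "HT"):
                return False
    return True


if __name__ == "__main__":
    print(longest_b_star_run("THTHT", "HT"))
    print(check_flip_correspondence(8))
```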

We are now ready to state our results about B-runs and B*-runs. The first
theorem corresponds to the Erdős-Révész Theorem A, although we use slightly
different language.


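Theorem 1. Let B be a nonperiodic pattern, and let $\{n_k\}$ be a sequence of positive
integers. If

\[ \sum_{k=1}^{\infty} n_k 2^{-k} = \infty, \tag{1.3} \]

then for almost all $\omega \in \Omega$ there is an infinite sequence $k_i = k_i(\omega, \{n_k\})$, $i = 1, 2, \ldots$, of
integers such that

\[ Z_{n_{k_i}}^*(B) \ge Z_{n_{k_i}}(B) \ge k_i, \quad i = 1, 2, \ldots. \]

If, on the other hand,

\[ \sum_{k=1}^{\infty} n_k 2^{-k} < \infty, \tag{1.4} \]

then for almost all $\omega \in \Omega$ there exists a positive integer $k_0 = k_0(\omega, \{n_k\})$ such that

\[ Z_{n_k}(B) \le Z_{n_k}^*(B) < k \quad \text{for } k \ge k_0. \]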


It is rather remarkable that the results of Theorem 1 are independent of the
precise nature of B. In fact, quite a bit more can be shown using the methods of
this paper. Suppose that we are given a sequence $\{A_k\}$ of patterns, $A_k$ of length k,
and a sequence $\{n_k\}$ of positive integers. Then, if (1.3) holds, for almost all $\omega \in \Omega$
there is an infinite sequence $k_i = k_i(\omega, \{n_k\})$, $i = 1, 2, \ldots$, of integers such that the
first $n_{k_i}$ tosses contain the pattern $A_{k_i}$. If, on the other hand, (1.4) holds, then
there exists a positive integer $k_0 = k_0(\omega, \{n_k\})$ such that for $k \ge k_0$ the first $n_k$
tosses do not contain $A_k$.

While the results of Theorem 1 are insensitive to the nature of B, this is not
the case for our other results, which improve on the Erdős-Révész Theorem B.


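Theorem 2. Suppose that B is a nonperiodic pattern of length m and let $\{n_k\}$ be a
sequence of positive integers. If

\[ \sum_{k=1}^{\infty} \exp(-n_k(1 - 2^{-m})2^{-k}) < \infty, \]

then for almost all $\omega \in \Omega$ there is a $k_0 = k_0(\omega, \{n_k\})$ such that

\[ Z_{n_k}(B) \ge k \quad \text{for } k \ge k_0. \]

If

\[ \sum_{k=1}^{\infty} \exp(-n_k(1 - 2^{-m})2^{-k}) = \infty, \]

then for almost all $\omega \in \Omega$ there is an infinite sequence $k_i = k_i(\omega, \{n_k\})$, $i = 1, 2, \ldots$,
such that

\[ Z_{n_{k_i}}(B) < k_i, \quad i = 1, 2, \ldots. \]


Theorem 3. Let B and $\{n_k\}$ be as in Theorem 2. If

\[ \sum_{k=1}^{\infty} \exp(-n_k m 2^{-k-1}) < \infty, \]

then for almost all $\omega \in \Omega$ there is a $k_0 = k_0(\omega, \{n_k\})$ such that

\[ Z_{n_k}^*(B) \ge k \quad \text{for } k \ge k_0. \]

If

\[ \sum_{k=1}^{\infty} \exp(-n_k m 2^{-k-1}) = \infty, \]

then for almost all $\omega \in \Omega$ there is an infinite sequence $k_i = k_i(\omega, \{n_k\})$ such that

\[ Z_{n_{k_i}}^*(B) < k_i, \quad i = 1, 2, \ldots. \]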


It is also possible to generalize Theorem 2 in a slightly different direction. As
in the remarks after Theorem 1, let us suppose that $\{A_k\}$ is a sequence of
patterns, $A_k$ of length k, and $\{n_k\}$ a sequence of positive integers. When will it be
true for almost all $\omega \in \Omega$ that for all sufficiently large k, the first $n_k$ elements of $\omega$
will contain the pattern $A_k$? The methods used to prove Theorems 2 and 3 show
that this will happen if and only if

\[ \sum_{k=1}^{\infty} \exp(-n_k (2a_k)^{-1}) < \infty, \]

where $a_k$ is an integer, $2^{k-1} \le a_k \le 2^k - 1$, that depends on $A_k$, and indicates how
much $A_k$ can overlap itself. (In the notation of Sect. 2, $a_k = (A_kA_k)_2$.)

Let $p(B, k; n)$ (resp. $p^*(B, k; n)$) denote the probability that the first n coin
tosses contain no B-run (resp. B*-run) of length k. Our proofs of Theorems 1-3
rely on sharp asymptotic estimates for $p(B, k; n)$ and $p^*(B, k; n)$. We show (see
Theorem 3.1 for the precise result) that if B is of length m, these quantities are
approximated very well by

\[ p^*(B, k; n) \sim \exp(-nm2^{-k-1}), \tag{1.5} \]
\[ p(B, k; n) \sim \exp(-n(1 - 2^{-m})2^{-k}). \tag{1.6} \]
Suppose that $B = b_1 \ldots b_m$, and let $B_r = b_r \ldots b_m b_1 \ldots b_{r-1}$. Then the
non-existence of a B*-run of length k is equivalent to the absence of $B_r$-runs of length k
for $1 \le r \le m$. If the occurrences of different $B_r$-runs of length k were independent
of each other, we would expect that $p^*(B, k; n)$ would be approximately

\[ \exp(-nm(1 - 2^{-m})2^{-k}). \]

The difference between this quantity and $\exp(-nm2^{-k-1})$ provides a rather
precise measure of the dependencies between the $B_r$-runs.
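
For pure head-runs (B = H, m = 1) the approximation (1.6) reads $p(B, k; n) \approx \exp(-n2^{-k-1})$, and it can be compared with the exact probability on a small case. The sketch below is only a numerical illustration: the function names are made up for this example, and the exact count uses the elementary recurrence for binary strings with no k consecutive 1's rather than the generating functions of Sect. 2.

```python
import math


def count_no_head_run(n: int, k: int) -> int:
    """Number of binary strings of length n with no k consecutive 1's (k >= 1).

    c(j) = 2^j for j < k, c(k) = 2^k - 1, and for j > k the identity
    c(j) = 2 c(j-1) - c(j-k-1) follows from c(j) = c(j-1) + ... + c(j-k).
    """
    counts = []
    for length in range(n + 1):
        if length < k:
            counts.append(2 ** length)
        elif length == k:
            counts.append(2 ** k - 1)
        else:
            counts.append(2 * counts[length - 1] - counts[length - k - 1])
    return counts[n]


def exact_no_head_run_probability(n: int, k: int) -> float:
    """Exact probability that n fair tosses contain no head-run of length k."""
    return count_no_head_run(n, k) / 2 ** n


def approx_no_head_run_probability(n: int, k: int) -> float:
    """Right-hand side of (1.6) with m = 1."""
    return math.exp(-n * 2.0 ** (-k - 1))


if __name__ == "__main__":
    n, k = 1000, 10
    print(exact_no_head_run_probability(n, k))
    print(approx_no_head_run_probability(n, k))
```

The two printed values should agree to about two decimal places for these parameters, in line with the error term of Theorem 3.1.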

The approximations (1.5) and (1.6) show that the maximal B-runs (or B*-runs)
tend to be about lg n. However, there is no limiting distribution. It will be
shown (see Theorem 4.1) that the averages of $Z_n(B)$ and $Z_n^*(B)$ have
quasiperiodic components:

\[ E(Z_n^*(B)) = \lg n - 3/2 + \gamma(\log 2)^{-1} + \lg m + v_1(\lg n + \lg m) + o(1) \quad \text{as } n \to \infty, \]

\[ E(Z_n(B)) = \lg n - 5/2 + \gamma(\log 2)^{-1} + \lg(1 - 2^{-m}) + v_1(\lg n + \lg(1 - 2^{-m})) + o(1) \quad \text{as } n \to \infty, \]

where $\gamma$ is Euler's constant (= 0.577...) and $v_1(x)$ is a nonconstant continuous
periodic function with period 1 and mean 0 (the same for all patterns B). Also,


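\[ \mathrm{Var}(Z_n^*(B)) = c + v_2(\lg n + \lg m) + o(1) \quad \text{as } n \to \infty, \]
\[ \mathrm{Var}(Z_n(B)) = c + v_2(\lg n + \lg(1 - 2^{-m})) + o(1) \quad \text{as } n \to \infty, \]

where $v_2(x)$ is a nonconstant continuous periodic function with period 1 and
mean 0, and

\[ c = \frac{1}{12} + \frac{\pi^2}{6(\log 2)^2} + c', \]

where

\[ c' = \frac{2}{\log 2} \sum_{k=0}^{\infty} \log\bigl(1 - \exp(-2\pi^2(2k+1)/\log 2)\bigr) = -1.237\ldots \times 10^{-12}. \]

(The constant c' can be expressed in terms of Dedekind's eta-function.) For pure
head runs, these results had first been proved by Boyd [1].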

Although the functions $v_1(x)$ and $v_2(x)$ are not constant, they are quite small.
By looking at the Fourier series expansion of $v_1(x)$ (given in Theorem 4.1), it can
be shown, for example, that $|v_1(x)| < 1.6 \times 10^{-6}$ for all x.
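
The size of these quantities can be checked numerically. The sketch below assumes that the r-th Fourier coefficient of $v_1$ has modulus $|\Gamma(2\pi i r/\log 2)|/\log 2$, as in Theorem 4.1, and uses the standard identity $|\Gamma(iy)|^2 = \pi/(y\sinh \pi y)$ for real $y \ne 0$; it also evaluates the series for the constant c'. Function names are illustrative.

```python
import math

LOG2 = math.log(2.0)


def gamma_modulus_imaginary(y: float) -> float:
    """|Gamma(i y)| for real y != 0, via |Gamma(iy)|^2 = pi / (y sinh(pi y))."""
    return math.sqrt(math.pi / (y * math.sinh(math.pi * y)))


def v1_sup_bound(terms: int = 5) -> float:
    """Sum of the moduli of the Fourier coefficients of v_1 (r = +-1, +-2, ...)."""
    return sum(2.0 * gamma_modulus_imaginary(2.0 * math.pi * r / LOG2) / LOG2
               for r in range(1, terms + 1))


def c_prime(terms: int = 5) -> float:
    """(2 / log 2) * sum_k log(1 - exp(-2 pi^2 (2k+1) / log 2))."""
    return (2.0 / LOG2) * sum(
        math.log1p(-math.exp(-2.0 * math.pi ** 2 * (2 * k + 1) / LOG2))
        for k in range(terms))


if __name__ == "__main__":
    print(v1_sup_bound())   # should be slightly below 1.6e-6
    print(c_prime())        # should be close to -1.237e-12
```

Both series are dominated by their first term, so a handful of terms is enough at double precision.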

An interesting fact about $Z_n(B)$ and $Z_n^*(B)$ that can be deduced from the
approximations (1.5) and (1.6) is that both $Z_n(B)$ and $Z_n^*(B)$ are much more likely
to exceed their mean by a constant k than to be less than the mean by k. Also,
the results of [9] show that if $p_n(k)$ is the probability that the longest head-run
among the first n elements has length k, then for any fixed n, $p_n(k)$ is a unimodal
function of k. It is not known to what extent this is true for other types of B-runs
and B*-runs.
The basic tool used in this paper is a formula for the generating function for
the probability that any of a given set of patterns does not occur in the first r
coin tosses, which was proved in [6]. When only a single pattern is to be
excluded, such a formula was first presented explicitly by Solov'ev [12]. When
several patterns are excluded, the general formula is quite complicated (see
Theorem 2.1). However, when the excluded patterns are precisely the $B_r$-runs, we
have succeeded in deriving a manageable form for the generating function.

Our basic generating function methods also allow us to deal with another
problem raised in [11], namely that of the number of very long head runs (and,
more generally, of B-runs and B*-runs). Let us define a sequence $\{n_k\}$ by

\[ n_k = [2^{k+1}(\log k + 2\log\log k)], \]

say, and let $\nu_k(r)$ be the maximal number of disjoint head-runs of length r in the
first $n_k$ coin tosses (i.e., $\nu_k(r)$ is the maximal $\nu$ such that there are integers
$1 \le l_1 < l_2 < \cdots < l_\nu \le n_k - r + 1$, $l_i + r \le l_{i+1}$ for $1 \le i \le \nu - 1$, such that there are
head-runs of length r starting in positions $l_1, \ldots, l_\nu$). Then Theorem 2 assures us
that with probability 1,

\[ \liminf_{k\to\infty} \nu_k(k) \ge 1, \qquad \liminf_{k\to\infty} \nu_k(k+1) = 0. \]

The results in Sect. 6 show that in fact

\[ \liminf_{k\to\infty} \nu_k(k) = 1, \qquad \limsup_{k\to\infty} \frac{\nu_k(k)}{\log k} = e, \]

\[ \limsup_{k\to\infty} \frac{\nu_k(k-1)}{\log k} = c_1 = 4.31107\ldots, \]

\[ \liminf_{k\to\infty} \frac{\nu_k(k-1)}{\log k} = c_2 = 0.37336\ldots, \]

where $c_1$ and $c_2$ are the two roots of

\[ \left(\frac{2e}{x}\right)^{x} = e, \]

and in general for any fixed $m \in Z^+$,

\[ \limsup_{k\to\infty} \frac{\nu_k(k-m)}{\log k} = c_1(m), \qquad \liminf_{k\to\infty} \frac{\nu_k(k-m)}{\log k} = c_2(m), \]

where $c_2(m) < c_1(m)$ are the two roots of

\[ \left(\frac{2^m e}{x}\right)^{x} = e^{2^m - 1}. \]
x 


It can be shown that $c_1(m)/c_2(m) \to 1$ as $m \to \infty$. Similar results can be derived in
the case of counting overlapping appearances of head-runs of a given length (i.e.,
when we simply look at the total number of positions at which head-runs of a
given length start) and also for other B-runs and B*-runs.

The fact that $c_1(m)/c_2(m) \to 1$ as $m \to \infty$ reflects the fact that the number of
appearances (overlapping or not) of a pattern A of length k in a random string
of length n is normally distributed with mean $n2^{-k}$ if k is much less than lg n.
(But the variance depends nontrivially on A.) We will not discuss this problem
now, however.
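
The numerical values 4.31107... and 0.37336... quoted above can be reproduced with a few lines of code. The sketch assumes that the defining equation is $(2^m e/x)^x = e^{2^m - 1}$, that is, $x(1 + m\log 2 - \log x) = 2^m - 1$ after taking logarithms, with m = 1 for the two quoted constants; this form is a reading of the displayed equation, and the bracketing intervals and function names are choices made for this example.

```python
import math


def defining_function(x: float, m: int) -> float:
    """x (1 + m log 2 - log x) - (2^m - 1); its zeros are c_2(m) < c_1(m)."""
    return x * (1.0 + m * math.log(2.0) - math.log(x)) - (2.0 ** m - 1.0)


def bisect_root(m: int, lo: float, hi: float, iterations: int = 200) -> float:
    """Plain bisection; requires a sign change of defining_function on [lo, hi]."""
    f_lo = defining_function(lo, m)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = defining_function(mid, m)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def run_count_constants(m: int = 1):
    """Return (c_1(m), c_2(m)): the function is positive at x = 2^m,
    negative near 0 and negative at x = e * 2^m."""
    peak = 2.0 ** m
    lower_root = bisect_root(m, 1e-12, peak)
    upper_root = bisect_root(m, peak, math.e * peak)
    return upper_root, lower_root


if __name__ == "__main__":
    print(run_count_constants(1))
    c1, c2 = run_count_constants(10)
    print(c1 / c2)  # the ratio approaches 1 as m grows
```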


2. Generating Functions 


In this section we will utilize the basic results from [6] to obtain simple 
expressions for some generating functions. These expressions will then be used in 
the next section to compute the probabilities of occurrence of various repetitive 
patterns. 

We first introduce some notation from [6]. Let X and Y denote patterns
(finite strings of symbols) over some two-element alphabet. The correlation of X
and Y, to be denoted by XY, is a string over {0,1} with the same length as X.
The i-th bit (from the left) of XY is determined as follows: place Y under X so
that its leftmost character is under the i-th character of X (from the left). Then, if
all the pairs of characters in the overlapping segment are identical, the i-th bit of
XY is 1, else it is 0. For example, if X = HTHTTH and Y = HTTHT, then
XY = 001001, as depicted below:


    X:  HTHTTH
    Y:  HTTHT          0
         HTTHT         0
          HTTHT        1
           HTTHT       0
            HTTHT      0
             HTTHT     1


Note that YX = 00010, so that in general XY ≠ YX. It makes sense to define the
autocorrelation of X as XX. Thus for the Y above, YY = 10010. XX is a
representation of the set of periods of X, i.e., those shifts that cause X to overlap
itself. The question of characterizing those binary patterns that are correlations
for some patterns is dealt with in a separate paper [7].

We often wish to interpret the correlation XY as a number in some base t, or
else a polynomial in the variable t, in which case we write $XY_t$. Thus, for the
above example

\[ XY_t = t^3 + 1, \qquad XY_2 = 9. \]
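
The definition of the correlation translates directly into code. The following sketch (function names are illustrative, not taken from [6]) computes the correlation string and its value in a given base, and reproduces the examples above.

```python
def correlation(x: str, y: str) -> str:
    """Correlation XY as a 0/1 string with the same length as x.

    Bit i (0-based here) is 1 when y, placed with its first character
    under x[i], agrees with x on the whole overlapping segment.
    """
    bits = []
    for i in range(len(x)):
        overlap = min(len(x) - i, len(y))
        bits.append("1" if x[i:i + overlap] == y[:overlap] else "0")
    return "".join(bits)


def correlation_value(x: str, y: str, base: int = 2) -> int:
    """XY_t evaluated at t = base (leftmost bit is the highest power)."""
    bits = correlation(x, y)
    return sum(base ** (len(bits) - 1 - i)
               for i, bit in enumerate(bits) if bit == "1")


if __name__ == "__main__":
    X, Y = "HTHTTH", "HTTHT"
    print(correlation(X, Y))           # 001001
    print(correlation(Y, X))           # 00010
    print(correlation(Y, Y))           # 10010
    print(correlation_value(X, Y, 2))  # 9
```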


Two more points of terminology: we write |X| for the length of X, with |X| = 0
if X is the empty string, and we call a set of patterns {A, B, ..., T} reduced if
I is never a substring of J, for any two distinct patterns I, J in our set.

Suppose that {A, B, ..., T} is a reduced set of patterns. Let
$f(n) = f(A, B, \ldots, T; n)$ denote the number of strings of length n over our alphabet
that do not contain any of A, B, ..., T. We denote by F(z) the corresponding
generating function

\[ F(z) = \sum_{n=0}^{\infty} f(n) z^{-n}. \]

For a pattern H in the set, let $f_H(n)$ denote the number of strings of length n that end with H and do not
contain any of A, B, ..., T except for that single appearance of H at the end of
the string, and let $F_H(z)$ be the generating function of $f_H(n)$. The basic result of
[6] is the following system of equations:


Theorem 2.1. If {A, ..., T} is a reduced set of patterns over an alphabet of size 2,
then the generating functions $F(z), F_A(z), \ldots, F_T(z)$ satisfy the following system of
linear equations:

\[
\begin{aligned}
(z-2)F(z) + zF_A(z) + zF_B(z) + \cdots + zF_T(z) &= z, \\
F(z) - z\,AA_z F_A(z) - z\,BA_z F_B(z) - \cdots - z\,TA_z F_T(z) &= 0, \\
&\;\;\vdots \\
F(z) - z\,AT_z F_A(z) - z\,BT_z F_B(z) - \cdots - z\,TT_z F_T(z) &= 0.
\end{aligned}
\tag{2.1}
\]


It is shown in [6] that the system (2.1) is always nonsingular, so that one can
solve for $F(z), F_A(z), \ldots, F_T(z)$ as rational functions of z. If we exclude only a
single pattern A, the solution is very simple:

\[ F(z) = \frac{z\,AA_z}{1 + (z-2)AA_z}, \qquad F_A(z) = \frac{1}{1 + (z-2)AA_z}. \tag{2.2} \]
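
The single-pattern formula can be sanity-checked numerically. The sketch below assumes the convention $F(z) = \sum_n f(n) z^{-n}$ and the closed form $F(z) = z\,AA_z/(1 + (z-2)AA_z)$, counts f(n) by brute force for small n, and compares a truncated series with the closed form at a point z > 2 where the series converges quickly. All names are hypothetical.

```python
from itertools import product


def autocorrelation_poly(pattern: str, z: float) -> float:
    """AA_z: sum of z^(k-1-i) over shifts i at which the pattern overlaps itself."""
    k = len(pattern)
    return sum(z ** (k - 1 - i)
               for i in range(k) if pattern[i:] == pattern[:k - i])


def closed_form(pattern: str, z: float) -> float:
    """Right-hand side of the single-pattern formula for F(z)."""
    aa = autocorrelation_poly(pattern, z)
    return z * aa / (1.0 + (z - 2.0) * aa)


def truncated_series(pattern: str, z: float, max_len: int) -> float:
    """Sum of f(n) z^(-n) for n <= max_len, with f(n) counted by enumeration."""
    total = 0.0
    for n in range(max_len + 1):
        avoiding = sum(1 for letters in product("HT", repeat=n)
                       if pattern not in "".join(letters))
        total += avoiding / z ** n
    return total


if __name__ == "__main__":
    for pattern in ("HH", "HTH", "HTTHT"):
        print(pattern, closed_form(pattern, 8.0),
              truncated_series(pattern, 8.0, 12))
```

Since f(n) is at most $2^n$, the neglected tail at z = 8 with 12 terms is below $2 \times 10^{-8}$.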


Theorem 2.2. Suppose that B is a nonperiodic pattern, |B| = m, and that
A = BB...BB', where B' is a prefix of B, |A| = k. Then the enumerator F(z) of the
strings that do not contain A has the form

\[ F(z) = \frac{z f(z)}{1 + (z-2) f(z)}, \tag{2.3} \]

where

\[ f(z) = \frac{z^{k+m-1} + q(z)}{z^m - 1}, \tag{2.4} \]

and q(z) is a polynomial of degree < 2m, and the coefficients of q(z) are bounded
by 2 in absolute value.


Proof. It is easy to show [7] that A has no periods < k - m that are not
multiples of m, whereas all multiples of m are periods of A, so

\[ AA_z = z^{k-1} + z^{k-1-m} + z^{k-1-2m} + \cdots + p(z), \]

where deg p(z) < m, and the coefficients of p(z) are 0 or 1. Together with (2.2) this
immediately yields the theorem.


Theorem 2.3. Suppose that B is a nonperiodic pattern, |B| = m, and that
A = BB...BB', where B' is a prefix of B, $|A| = k \ge 3m$. Then the enumerator F(z) of
the strings (over the same two-element alphabet that forms B) that contain neither
A nor any of the cyclic shifts of A (i.e., which contain no B*-run of length k) has
the form

\[ F(z) = \frac{z^{(k+1)m+1}(z^m - 1)^{m-1} + q_1(z)}{(z-2)\,z^{(k+1)m}(z^m - 1)^{m-1} + q_2(z)}, \tag{2.5} \]

where $q_1(z)$ and $q_2(z)$ are polynomials in z of degrees $< k(m-1) + c_1$, with
coefficients that are $\le c_2$ in absolute value. Here $c_1$ and $c_2$ are constants that
depend only on m, but not on k. Furthermore,

\[ q_1(2) = O(2^{k(m-1)}), \tag{2.6} \]
\[ q_2(2) = m\,2^{km-k+m}(2^m - 1)^{m-1} + O(2^{k(m-2)}), \tag{2.7} \]

where the constants implied by the O-notation depend only on m.


Proof. Suppose that $B = b_1 \ldots b_m$, and define patterns A(1), ..., A(m) of length k
by

\[ A(r) = b_{m-r+1} \ldots b_m BB \ldots BB', \]

where B' = B'(r) is a prefix of B of the appropriate length. The generating
function F(z) counts strings which contain none of A(1), ..., A(m).
If 1<r<s<m, then we easily check that 





gk-1 ~§+"+-q. (2) 
z™—1 , 


A(r)A(s), = 


where q, ,(z) is a polynomial of degree <2m—1 with coefficients that are <2 in 
absolute value. Similarly, if 1<s<r<m, then 


gktr—-s-1 tm 4g. (2) 


A@A(),=—_, 2, 





where $q_{r,s}(z)$ satisfies the same properties as in the case $1 \le r \le s \le m$. By Theorem
2.1, F(z) can be obtained by solving the system of equations

\[ M\,(F(z), F_{A(1)}(z), \ldots, F_{A(m)}(z))^T = (z, 0, 0, \ldots, 0)^T, \tag{2.8} \]

where

\[
M = \begin{pmatrix}
z-2 & z & \cdots & z \\
1-z^m & & & \\
\vdots & & M' & \\
1-z^m & & &
\end{pmatrix}
\tag{2.9}
\]

with M' = Q + C, where Q and C are m×m matrices with polynomial entries.
The (r, s)-entry of Q is $z\,q_{r,s}(z)$, while C is the circulant matrix whose first row is
$(z^{k+m}, z^{k+m-1}, \ldots, z^{k+1})$.
We will solve (2.8) for F(z) by Cramer's rule:

\[ F(z) = \frac{\det N}{\det M}, \tag{2.10} \]

where N is obtained from M by replacing the first column of M by $(z, 0, \ldots, 0)^T$,
so that

\[ \det N = z \det(Q + C). \tag{2.11} \]

We will evaluate det(Q + C) later, after we dispose of the harder problem of
evaluating det M.
To evaluate det M, consider first

\[
M_1 = M_1(z, y) = \begin{pmatrix}
z-2 & z & \cdots & z \\
1-z^m & & & \\
\vdots & & M_1' & \\
1-z^m & & &
\end{pmatrix},
\tag{2.12}
\]

where $M_1' = M_1'(z, y) = Q + C_1$, Q is as before, but $C_1$ is now the m×m circulant
matrix with first row equal to

\[ (yz^m, yz^{m-1}, \ldots, yz). \]

Note that $M = M_1(z, z^k)$. Expand $\det M_1(z, y)$ as a polynomial in y:

\[ \det M_1(z, y) = \sum_{j=0}^{m} a_j(z)\, y^j. \]

Since the polynomials that are elements of $M_1$ all have degrees < 2m in z, and
all the coefficients are bounded by 2 in absolute value, each $a_j(z)$ has degree
< 2m(m+1), and has coefficients bounded in absolute value by $4^{m+1}(m+1)! = O(m^{2m})$.
(These estimates can be improved significantly through more careful
analysis.)


We next investigate $a_m(z)$ and $a_{m-1}(z)$. Since y appears only in $M_1'$, and only
to the first power, we see that

\[ a_m(z)\, y^m = (z-2) \det C_1. \]

To evaluate $\det C_1$, subtract $z^{r-1}$ times the first row of $C_1$ from the r-th row.
This gives

\[
\det C_1 = \det \begin{pmatrix}
yz^m & yz^{m-1} & \cdots & yz^2 & yz \\
yz(1-z^m) & 0 & \cdots & 0 & 0 \\
yz^2(1-z^m) & yz(1-z^m) & \cdots & 0 & 0 \\
\vdots & & \ddots & & \vdots \\
yz^{m-1}(1-z^m) & yz^{m-2}(1-z^m) & \cdots & yz(1-z^m) & 0
\end{pmatrix}
= (-1)^{m-1} yz\,[yz(1-z^m)]^{m-1}, \tag{2.13}
\]

so that

\[ a_m(z)\, y^m = (z-2)\, z^m (z^m - 1)^{m-1} y^m. \tag{2.14} \]


We next evaluate $a_{m-1}(2)$. As a first step we claim that $a_{m-1}(2)$ is
independent of the polynomials $q_{r,s}(z)$. To prove this, write $M_1 = (m_{ij})$, $0 \le i, j \le m$,
and let $\sigma$ denote any permutation on {0, 1, ..., m}, so that the term
corresponding to $\sigma$ in the expansion of $\det M_1$ is

\[ b_\sigma = \pm \prod_{i=0}^{m} m_{i,\sigma(i)}. \]

If the coefficient of $y^{m-1}$ in $b_\sigma$ depends on $q_{r,s}$ for some r, s, $1 \le r, s \le m$, then we
must have $\sigma(r) = s$ and, moreover, we must have $\sigma(i) \in \{1, \ldots, m\}$ for $1 \le i \le m$. But
then $\sigma(0) = 0$, $b_\sigma$ is divisible by z - 2, and so $b_\sigma$ vanishes at z = 2. This proves the
claim. Using this result, we can therefore write $a_{m-1}(2)\, y^{m-1} = \det M_2(2, y)$,
where

\[
M_2 = M_2(z, y) = \begin{pmatrix}
z-2 & z & \cdots & z \\
1-z^m & & & \\
\vdots & & C_1 & \\
1-z^m & & &
\end{pmatrix}.
\]

To compute $\det M_2$, subtract $y^{-1}$ times the second row from the first row, and
then, for $3 \le r \le m+1$, subtract $z^{r-2}$ times the second row from the r-th.
Expanding by the last column, we find that

\[ \det M_2 = (-1)^{m-1} yz \det M_3, \]

where

\[
M_3 = \begin{pmatrix}
z-2+(z^m-1)/y & z-z^m & z-z^{m-1} & \cdots & z-z^2 \\
(z^m-1)(z-1) & & & & \\
(z^m-1)(z^2-1) & & U & & \\
\vdots & & & & \\
(z^m-1)(z^{m-1}-1) & & & &
\end{pmatrix},
\]

where $U = (u_{ij})$ is an (m-1)×(m-1) matrix with $u_{ij} = 0$ for j > i and
$u_{ij} = yz^{i+1-j}(1-z^m)$ for $1 \le j \le i \le m-1$. Subtracting $y^{-1}(1-z)(1-z^m)^{-1}$ times the
sum of the second through m-th rows from the first row of $M_3$ transforms $M_3$
into a lower-triangular matrix with diagonal $(S, yz(1-z^m), yz(1-z^m), \ldots, yz(1-z^m))$, where

\[
S = z - 2 + y^{-1}(z^m - 1) - y^{-1}(1-z)\sum_{r=2}^{m}(1 - z^{r-1})
= z - 2 + m(z-1)\,y^{-1}.
\]


Therefore

\[ \det M_2 = (z^m - 1)^{m-1} z^m y^{m-1} \{(z-2)\,y + m(z-1)\}, \]

so that

\[ a_{m-1}(2)\, y^{m-1} = m\,2^m (2^m - 1)^{m-1} y^{m-1}. \]

(Note that the above result provides another evaluation of $a_m(z)$ as well.)
We next evaluate det N. If we let

\[ N_1 = N_1(z, y) = Q + C_1, \]

then $\det N = z \det N_1(z, z^k)$. Now

\[ \det N_1(z, y) = \sum_{j=0}^{m} b_j(z)\, y^j \]

for some polynomials $b_j(z)$ with degrees < 2m(m+1) and coefficients that are
$O(m^{2m})$. Furthermore,

\[ b_m(z)\, y^m = \det C_1 = y^m z^m (z^m - 1)^{m-1} \]

by (2.13). If we now combine our evaluations of det N and det M with (2.10), we
obtain Theorem 2.3.


3. Asymptotic Estimates 


In this section we will use the results of Sect. 2 to obtain asymptotic estimates
for the probabilities p(B, k; n) and p*(B, k; n), when B is a nonperiodic pattern 
of length m. All our estimates will assume that B is kept fixed while k and n vary, 
and the constants in the O-notation will, in general, depend on m, but not on k 
and n. Our exposition here will be rather brief, since the basic methods used are 
rather standard. (Very detailed proofs by similar methods are given in [5].) 


Theorem 3.1. Let B be a nonperiodic pattern of length m. Then

\[ p^*(B, k; n) = \exp\{-nm2^{-k-1} + O(nk^2 2^{-2k} + k2^{-k})\}, \]
\[ p(B, k; n) = \exp\{-n(1 - 2^{-m})2^{-k} + O(nk^2 2^{-2k} + k2^{-k})\}. \]


Proof. Let

\[ f(z) = (z-2)\, z^{(k+1)m} (z^m - 1)^{m-1} + q_2(z), \]

where $q_2(z)$ is given by Theorem 2.3. Since

\[ f(z) = f(2) + (z-2) f'(2) + O((z-2)^2 k^2 2^{km}) \]

for $1 \le z \le 2$, we deduce that f(z) has a zero $\rho$ with

\[ \rho = 2 - m2^{-k} + O(k^2 2^{-2k}) = 2\exp\{-m2^{-k-1} + O(k^2 2^{-2k})\} \]

for $k \ge k_1(m)$. On the circle |z| = 3/2,

\[ |q_2(z)| = O((3/2)^{k(m-1)}) < |(z-2)\, z^{(k+1)m} (z^m - 1)^{m-1}| \]

if $k \ge k_2(m)$. Therefore by Rouché's theorem, f(z) has only one zero (namely $\rho$) in
$|z| \ge 3/2$. Let $\alpha$ denote the residue of F(z) at $z = \rho$. Then

\[ \alpha = \frac{\rho^{(k+1)m+1}(\rho^m - 1)^{m-1} + q_1(\rho)}{f'(\rho)} = 2(1 + O(k2^{-k})) = 2\exp\{O(k2^{-k})\} \]

for $k \ge k_3(m)$.
We now consider

\[ F(z) - \frac{\alpha}{z - \rho}. \]

This function is analytic on $|z| \ge 3/2$, and is O(1) on |z| = 3/2. Hence the
coefficient of $z^{-n}$ in its Taylor series expansion around $z = \infty$ is $O((3/2)^n)$, and
therefore the coefficient of $z^{-n}$ in the expansion of F(z) is, for
$k \ge k_4(m) = \max(k_1(m), k_2(m), k_3(m))$,

\[ \alpha\rho^{n-1} + O((3/2)^n) = 2^n \exp\{-nm2^{-k-1} + O(nk^2 2^{-2k} + k2^{-k})\} + O((3/2)^n). \]

But for $n \ge 10k$, $k \ge k_4(m)$, this is

\[ 2^n \exp\{-nm2^{-k-1} + O(nk^2 2^{-2k} + k2^{-k})\}. \]

Since this estimate also holds trivially for n < 10k, $k \ge k_4$, as well as for all n and
$k < k_4$ (although with the constants implied by the O-notation being larger), this
proves the first part of the theorem. The second part follows by a similar, but
easier argument, which now relies on Theorem 2.2 rather than 2.3.


4. Periodicities in Distributions 
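
In this section we use the asymptotic estimates of Theorem 3.1 to show that the
distributions of the lengths of maximal runs exhibit quasi-periodicities.


Theorem 4.1. Assume that B is a nonperiodic pattern of length m. Then

\[ E(Z_n^*(B)) = \lg n - 3/2 + \gamma(\log 2)^{-1} + \lg m + v_1(\lg n + \lg m) + O(n^{-1}(\log n)^2), \]

\[ E(Z_n(B)) = \lg n - 5/2 + \gamma(\log 2)^{-1} + \lg(1 - 2^{-m}) + v_1(\lg n + \lg(1 - 2^{-m})) + O(n^{-1}(\log n)^2), \]

where

\[ v_1(x) = -\frac{1}{\log 2} \sum_{r \ne 0} \Gamma\!\left(-\frac{2\pi i r}{\log 2}\right) \exp(2\pi i r x). \]

Also,

\[ \mathrm{Var}(Z_n^*(B)) = c + v_2(\lg n + \lg m) + O(n^{-1}(\log n)^3), \]
\[ \mathrm{Var}(Z_n(B)) = c + v_2(\lg n + \lg(1 - 2^{-m})) + O(n^{-1}(\log n)^3), \]

where $v_2(x)$ is a nonconstant continuous function with period 1 and mean 0, and

\[ c = \frac{1}{12} + \frac{\pi^2}{6(\log 2)^2} + \frac{2}{\log 2} \log \prod_{k=0}^{\infty} \bigl(1 - e^{-2\pi^2(2k+1)(\log 2)^{-1}}\bigr). \]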
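
As a numerical illustration of the first formula of Theorem 4.1, consider B = HT (m = 2), for which $Z_n^*(B)$ has the same distribution as the longest run of identical symbols in n fair tosses (by the correspondence described in the Introduction). The sketch below computes the exact mean of that longest run and compares it with $\lg n - 3/2 + \gamma(\log 2)^{-1} + \lg m$, ignoring the very small term $v_1$ and the error term. The reduction used in the code is an assumption of this example, not a step of the proof: a sequence of length n has no pure run of length k exactly when its sequence of n - 1 successive differences has no run of k - 1 zeros. All function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329


def count_no_head_run(n: int, k: int) -> int:
    """Binary strings of length n with no k consecutive 1's (k >= 1)."""
    counts = []
    for length in range(n + 1):
        if length < k:
            counts.append(2 ** length)
        elif length == k:
            counts.append(2 ** k - 1)
        else:
            counts.append(2 * counts[length - 1] - counts[length - k - 1])
    return counts[n]


def prob_longest_pure_run_below(n: int, k: int) -> float:
    """P(longest run of identical symbols among n >= 1 fair tosses is < k)."""
    if k <= 1:
        return 0.0
    # Differences x_i XOR x_{i+1} are i.i.d. fair bits; a pure run of
    # length k corresponds to k - 1 consecutive zero differences.
    return count_no_head_run(n - 1, k - 1) / 2 ** (n - 1)


def expected_longest_pure_run(n: int) -> float:
    """E(L) = sum over k >= 1 of P(L >= k)."""
    return sum(1.0 - prob_longest_pure_run_below(n, k) for k in range(1, n + 1))


def theorem_4_1_mean_star(n: int, m: int) -> float:
    """Main terms of E(Z_n^*(B)) without v_1 and the error term."""
    return math.log2(n) - 1.5 + EULER_GAMMA / math.log(2.0) + math.log2(m)


if __name__ == "__main__":
    n = 1024
    print(expected_longest_pure_run(n))
    print(theorem_4_1_mean_star(n, 2))
```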
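Proof. By Theorem 3.1, after some manipulation we obtain

\[
\begin{aligned}
E(Z_n^*(B)) &= \sum_{k \ge 0} k\,(p^*(B, k+1; n) - p^*(B, k; n)) \\
&= \sum_{k=-\infty}^{\infty} k \{\exp(-nm2^{-k-2}) - \exp(-nm2^{-k-1})\} + O(n^{-1}(\log n)^2) \\
&= \lg n + u(m, 1, \lg n) + O(n^{-1}(\log n)^2),
\end{aligned}
\]

where, for $h \in Z$, $h \ge 0$, $\alpha > 0$,

\[ u(\alpha, h, x) = \sum_{k=-\infty}^{\infty} (k - x)^h \{\exp(-\alpha 2^{x-k-2}) - \exp(-\alpha 2^{x-k-1})\}. \]

Now

\[ u(\alpha, h, x+1) = \sum_{k=-\infty}^{\infty} (k - 1 - x)^h \{\exp(-\alpha 2^{x-(k-1)-2}) - \exp(-\alpha 2^{x-(k-1)-1})\} = u(\alpha, h, x), \]

so $u(\alpha, h, x)$ is periodic in x with period 1. To investigate the variance of $Z_n^*(B)$,
we note that

\[
\begin{aligned}
E(Z_n^*(B)^2) &= \sum_{k \ge 0} k^2 (p^*(B, k+1; n) - p^*(B, k; n)) \\
&= \sum_{k=-\infty}^{\infty} k^2 \{\exp(-nm2^{-k-2}) - \exp(-nm2^{-k-1})\} + O(n^{-1}(\log n)^3) \\
&= 2E(Z_n^*(B)) \lg n - (\lg n)^2 + O(n^{-1}(\log n)^3) + u(m, 2, \lg n) \\
&= E(Z_n^*(B))^2 + u(m, 2, \lg n) - u(m, 1, \lg n)^2 + O(n^{-1}(\log n)^3).
\end{aligned}
\]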


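Thus we have reduced our problem to that of computing the Fourier series of
u(m, 1, x) and u(m, 2, x). But if

\[ u(\alpha, h, x) = \sum_{r=-\infty}^{\infty} c_{\alpha,h}(r)\, e^{2\pi i r x}, \]

then, by the uniform convergence of the series defining $u(\alpha, h, x)$, we obtain

\[
\begin{aligned}
c_{\alpha,h}(r) &= \int_0^1 u(\alpha, h, x)\, e^{-2\pi i r x}\, dx \\
&= \sum_{k=-\infty}^{\infty} \int_0^1 (k - x)^h \{\exp(-\alpha 2^{x-k-2}) - \exp(-\alpha 2^{x-k-1})\}\, e^{-2\pi i r x}\, dx \\
&= (-1)^h \int_{-\infty}^{\infty} x^h \{\exp(-\alpha 2^{x-2}) - \exp(-\alpha 2^{x-1})\}\, e^{-2\pi i r x}\, dx \\
&= (2\pi i)^{-h} \left.\frac{d^h}{ds^h}\, c_{\alpha,0}(s)\right|_{s=r},
\end{aligned}
\]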
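where we now define

\[ c_{\alpha,0}(s) = \int_{-\infty}^{\infty} \{\exp(-\alpha 2^{x-2}) - \exp(-\alpha 2^{x-1})\}\, e^{-2\pi i s x}\, dx \]

for any real s. But if we make the change of variables x = lg y, and set
$\beta = 2\pi i s (\log 2)^{-1}$, then

\[
\begin{aligned}
c_{\alpha,0}(s) &= (\log 2)^{-1} \int_0^{\infty} (e^{-\alpha y/4} - e^{-\alpha y/2})\, y^{-\beta-1}\, dy \\
&= \alpha(\beta \log 2)^{-1} \int_0^{\infty} \left(\tfrac{1}{2} e^{-\alpha y/2} - \tfrac{1}{4} e^{-\alpha y/4}\right) y^{-\beta}\, dy \\
&= \alpha(\beta \log 2)^{-1} \left\{\tfrac{1}{2}\left(\tfrac{\alpha}{2}\right)^{\beta-1} \Gamma(1-\beta) - \tfrac{1}{4}\left(\tfrac{\alpha}{4}\right)^{\beta-1} \Gamma(1-\beta)\right\} \\
&= (\log 2)^{-1} e^{-2\pi i s}(e^{-2\pi i s} - 1)\, \Gamma\!\left(-\frac{2\pi i s}{\log 2}\right) \exp\!\left(2\pi i s\, \frac{\log \alpha}{\log 2}\right),
\end{aligned}
\tag{4.1}
\]

at least for $s \ne 0$. Hence for $r \ne 0$, $r \in Z$,

\[ c_{\alpha,1}(r) = -\frac{1}{\log 2}\, \Gamma\!\left(-\frac{2\pi i r}{\log 2}\right) \exp\!\left(2\pi i r\, \frac{\log \alpha}{\log 2}\right). \]

The values of $c_{\alpha,h}(0)$ are obtained from (4.1) by letting $s \to 0$ in that formula,
since the pole of the gamma factor is cancelled by the zero of $\exp(-2\pi i s) - 1$. In
particular, we obtain

\[ c_{\alpha,1}(0) = -3/2 + \gamma(\log 2)^{-1} + \lg \alpha, \]
\[ c_{\alpha,2}(0) = 7/3 + \pi^2 6^{-1}(\log 2)^{-2} + \gamma^2(\log 2)^{-2} - 3\lg \alpha + (\lg \alpha)^2 + 2\gamma(\lg \alpha)(\log 2)^{-1} - 3\gamma(\log 2)^{-1}. \]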


Now the constant term in the Fourier expansion of $u(\alpha, 2, x) - u(\alpha, 1, x)^2$ is

\[ c_{\alpha,2}(0) - \sum_{r=-\infty}^{\infty} |c_{\alpha,1}(r)|^2 = \frac{1}{12} + \frac{\pi^2}{6(\log 2)^2} - \frac{1}{\log 2} \sum_{r=1}^{\infty} \frac{1}{r \sinh(ar)}, \]

where $a = 2\pi^2(\log 2)^{-1}$. Collecting all our estimates we obtain the claim of the
theorem.


The analysis for $Z_n(B)$ is almost identical, except that we work with
$u(2(1 - 2^{-m}), h, x)$ instead of u(m, h, x).
5. Long Runs in Infinite Sequences


In this section we prove Theorem 3. Proofs of Theorems 1 and 2 follow the same
pattern and will be omitted.

Suppose that we are given a sequence $n_k$, k = 1, 2, ..., of positive integers such
that

\[ \sum_{k=1}^{\infty} \exp(-n_k m 2^{-k-1}) < \infty. \tag{5.1} \]

We wish to show that

\[ \sum_{k=1}^{\infty} p^*(B, k; n_k) < \infty. \tag{5.2} \]

By Theorem 3.1 it will suffice to show that for any constant c > 0, (5.1) implies
that

\[ \sum_{k=1}^{\infty} \exp(-n_k m 2^{-k-1} + c\, n_k k^2 2^{-2k}) < \infty. \tag{5.3} \]

If $n_k > 2^{k+2}(\log k)^2$, then

\[ n_k m 2^{-k-1} - c\, n_k k^2 2^{-2k} > (\log k)^2 - c', \]

where c' is some constant depending on c and the $n_k$, and so the sum of the
terms in (5.3) corresponding to these values of k converges. If $n_k \le 2^{k+2}(\log k)^2$,
then $c\, n_k k^2 2^{-2k} < c''$, and again the sum of these terms in (5.3) converges if (5.1)
holds. Hence the entire sum in (5.3) is finite, which proves (5.2). But now, by the
Borel-Cantelli lemma, for almost all sequences $\omega$ we will have $Z_{n_k}^*(B) < k$ only
finitely often, which proves the first part of Theorem 3.
The proof of the second part of Theorem 3 is considerably more involved.
Suppose that we are given a sequence $n_k$, k = 1, 2, ..., of positive integers such
that

\[ \sum_{k=1}^{\infty} \exp(-n_k m 2^{-k-1}) = \infty. \tag{5.4} \]

We wish to show that for almost all sequences, $Z_{n_k}^*(B) < k$ occurs infinitely often.
We will use a version of the Borel-Cantelli lemma [10, p. 391]:


Lemma. Let S be a subset of the positive integers, $S_r = S \cap \{1, \ldots, r\}$, and let $A_i$,
$i \in S$, be arbitrary events such that

\[ \sum_{i \in S} \Pr(A_i) = \infty \tag{5.5} \]

and

\[ \liminf_{r \to \infty} \frac{\sum_{j,k \in S_r} \Pr(A_j A_k)}{\bigl(\sum_{i \in S_r} \Pr(A_i)\bigr)^2} = 1. \tag{5.6} \]

Then, with probability 1, infinitely many of the events $A_i$ occur. (Note that the
limes inferior above is always $\ge 1$.)
We will apply the above lemma with $A_j$ denoting the event $Z_{n_j}^*(B) < j$. Since
the sum of $\Pr(A_j)$ over those j with $n_j > 2^{j+2}(\log j)^2$ converges, we can disregard
such j. Let us denote by S' the set of j with

\[ n_j \le 2^{j+2}(\log j)^2. \tag{5.7} \]

If $j \le k$, but $n_j \ge n_k$, and $j, k \in S'$, then

\[ \Pr(A_j A_k) = \Pr(A_j) = \exp\{-n_j m 2^{-j-1} + e_{j,k}\}, \tag{5.8} \]

where

\[ e_{j,k} = O(n_j j^2 2^{-2j} + j2^{-j}) = O(j^3 2^{-j}) \tag{5.9} \]

by Theorem 3.1. If j < k and $n_j < n_k$, then $\Pr(A_j A_k)$ is bounded above by the
probability that the first $n_j$ elements contain no B*-run of length j and the next
$n_k - n_j$ elements contain no B*-run of length k; i.e.,

\[
\Pr(A_j A_k) \le \Pr(A_j) \cdot \Pr(Z_{n_k - n_j}^*(B) < k)
\le \exp\{-n_j m 2^{-j-1} - (n_k - n_j)\, m 2^{-k-1} + e_{j,k}\}, \tag{5.10}
\]

where for $j, k \in S'$,

\[ e_{j,k} = O(n_j j^2 2^{-2j} + n_k k^2 2^{-2k} + j2^{-j}) = O(j^3 2^{-j}). \tag{5.11} \]

Next we show that we can assume that

\[ a_k = \exp(-n_k m 2^{-k-1}) \to 0 \quad \text{as } k \to \infty. \tag{5.12} \]

If this were not the case, then we could find an infinite sequence $k_i$, i = 1, 2, ...,
such that $k_i \in S'$,

\[ n_{k_i} m 2^{-k_i - 1} < c \]

for some constant c, and $k_{i+1} > i k_i$. Choosing $S = \{k_i\}$ we would then find from
(5.8) and (5.10) that the hypotheses of the lemma are satisfied, which would then
prove the theorem.

We can thus assume that (5.12) holds. We choose S to consist of those 
elements of S’ for which a, <1/10, say. Then 


y Pr(4;A,)S2 ¥ Y Pr(AjA)s2 ¥ YY exp{—n,m2-4-*+ C7? 2-4} 


j, keS, keS, jeS, keS, jeS, 
isk isk 


nj=m 


+2 >) > exp{—n,m2-*-!—n,m2-4-! +n,m2-*—* + Cj> 2-4, (5.13) 
keS, jeS, 
j<k 


where C is a fixed positive constant. Now for keS, 


Dd exp{—njm2-J-'+ Cj? 2-4} 
jeS, 
isk 


=0(}" exp(—n,m2~~')) 
isk 
=O(a, +02 +af+...)=O(a,), 
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so that the first double sum on the right side of (5.13) is 


O(Y a). 


keS, 


To analyze the second double sum, choose any de(0, 1/10). Then for keS, 


T= > exp(—njm2-5-' +njm2-*-1 + Cj? 274 
ie 
nj <M 
nj <d2* 


< ¥ exp(—n,m2-J-! + dm+ Cj*2~4 
iS 
=(1+0(6)) }) exp(—njm2-4-* + Cj? 2-4), (5.14) 
ers 
where, as before, the constant implied by the O-notation may depend on m, but 
is independent of r, k, and 6. On the other hand 


T= ) exp(—njm2-3-' +n,m2-*-* + C7? 2-4 
iss: 
nek 
=0( » ai” ")=O(>. exp(—dm(2*-J-1 —2-})) 
jeS, j<k 


Jj< 
nj262* 


=0 (> exp(—3mh/2)) =0(5-1) 
=1 


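Combining (5.14) and (5.15), we obtain

\[ \sum_{j,k \in S_r} \Pr(A_j A_k) \le O\Big(\delta^{-1} \sum_{k \in S_r} a_k\Big) + 2(1 + O(\delta)) \sum_{\substack{j,k \in S_r \\ j < k}} a_j a_k\, e^{C j^3 2^{-j}}. \]

Let $J$ be such that $\exp(C j^3 2^{-j}) < 1 + \delta$ for $j \ge J$. Then the second sum on the right side above is

\[ \le (1 + \delta)^2 \Big(\sum_{\substack{j \in S_r \\ j \ge J}} \Pr(A_j)\Big)^2 + O\Big(\sum_{j \in S_r} \Pr(A_j)\Big) + O(1). \]

Hence

\[ \sum_{j,k \in S_r} \Pr(A_j A_k) \le (1 + O(\delta)) \Big(\sum_{j \in S_r} \Pr(A_j)\Big)^2 + O\Big(\delta^{-1} \sum_{k \in S_r} \Pr(A_k)\Big). \]

Therefore

\[ \liminf_{r \to \infty} \frac{\sum_{j,k \in S_r} \Pr(A_j A_k)}{\big(\sum_{j \in S_r} \Pr(A_j)\big)^2} \le 1 + O(\delta). \]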


Since this holds for every $\delta \in (0, 1/10)$, the theorem follows.

6. Counting the Number of Runs


In this section we will consider the problem of estimating the number of runs of 
a given type and length in random sequences. There is a choice to be made here; 
we can count either the total number of appearances of any of a specified set of 
patterns, or else the maximal number of nonoverlapping ones. (In the first case, 
we would say that HHHHH contains 4 appearances of HH while in the latter 
case we would count only 2.) In either case, asymptotic estimates can be 
obtained from the generating functions

\[ F_r(z) = \sum_{n=0}^{\infty} f_r(n) z^{-n}, \]

where $f_r(n)$ is the number of strings of length $n$ that contain exactly $r$ (nonoverlapping, or overlapping, as the case may be) appearances of patterns from a specified set. The case when we count the total number of appearances was considered briefly in [6]. It follows from formulas (2.4) and (2.5) of [6], for example, that if we are interested in the total number of appearances of a given pattern $A$, $|A| = k$, then for $r \ge 1$,

\[ F_r(z) = \frac{z^k \big(1 + (z-2)(AA_z - z^{k-1})\big)^{r-1}}{\big(1 + (z-2) AA_z\big)^{r+1}}. \]





This of course enables us to analyze B-runs. Other results of this kind can be 
derived for B*-runs. 

In this section we concentrate on the other problem, namely that of counting 
nonoverlapping appearances. 
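
The two counting conventions can be illustrated with a short sketch. The function names below are hypothetical and not taken from the paper; the nonoverlapping count uses a greedy left-to-right scan, which for a single pattern yields the maximal number of pairwise disjoint appearances.

```python
def count_overlapping(s, pattern):
    """Total number of appearances of `pattern` in `s`, overlaps allowed."""
    return sum(1 for i in range(len(s) - len(pattern) + 1)
               if s.startswith(pattern, i))


def count_nonoverlapping(s, pattern):
    """Maximal number of pairwise disjoint appearances of `pattern` in `s`.

    Greedy scan: take the leftmost appearance, then continue the search
    immediately after its last character.
    """
    count, i = 0, 0
    while True:
        j = s.find(pattern, i)
        if j < 0:
            return count
        count += 1
        i = j + len(pattern)


# The example from the text: HHHHH contains 4 appearances of HH
# in the first sense and 2 in the second.
print(count_overlapping("HHHHH", "HH"), count_nonoverlapping("HHHHH", "HH"))
```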


Lemma 6.1. Let $\mathcal{A}$ be any reduced set of patterns over some finite alphabet, and suppose that $f_r(n)$ denotes the number of strings of length $n$ over that same alphabet for which the maximal number of nonoverlapping appearances of patterns from $\mathcal{A}$ is equal to $r$, and that $g(n)$ denotes the number of strings of length $n$ that contain one appearance of a pattern from $\mathcal{A}$ in the initial position, but contain no other appearance of a pattern from $\mathcal{A}$ that is disjoint from or overlaps that initial one. Let

\[ G(z) = \sum_{n=0}^{\infty} g(n) z^{-n}, \qquad F_r(z) = \sum_{n=0}^{\infty} f_r(n) z^{-n}. \]

Then for $r \ge 1$,

\[ F_r(z) = F_0(z)\, G(z)^r. \]


Proof. Any string $X$ of length $n$ that contains a maximum of exactly $r$ nonoverlapping occurrences of patterns from $\mathcal{A}$ can be written uniquely as $X = YZ$, where $Z$ starts off with an initial appearance of a pattern in $\mathcal{A}$, but contains no other patterns from $\mathcal{A}$ anyplace, and $Y$ contains a maximum of exactly $r-1$ nonoverlapping appearances of patterns from $\mathcal{A}$. Moreover, for any $Y$ and $Z$ with these properties, $X = YZ$ contains a maximum of exactly $r$ nonoverlapping appearances of patterns from $\mathcal{A}$. Hence

\[ f_r(n) = \sum_{k} f_{r-1}(n-k)\, g(k), \]

and so

\[ F_r(z) = F_{r-1}(z)\, G(z), \]

which proves the lemma.
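
The convolution identity in the proof can be checked by exhaustive enumeration for short strings. The following is a minimal sketch for a single pattern over the binary alphabet; the helper names `f`, `g` and `check_recurrence` are assumptions made for this illustration, and the brute-force enumeration is only practical for small $n$.

```python
from itertools import product


def max_nonoverlapping(s, pattern):
    """Maximal number of disjoint appearances of `pattern` (greedy scan)."""
    count, i = 0, 0
    while True:
        j = s.find(pattern, i)
        if j < 0:
            return count
        count += 1
        i = j + len(pattern)


def occurrences(s, pattern):
    """Start positions of all appearances of `pattern` in `s`."""
    return [i for i in range(len(s) - len(pattern) + 1)
            if s.startswith(pattern, i)]


def f(r, n, pattern, alphabet="01"):
    """Number of length-n strings whose maximal nonoverlapping count is r."""
    return sum(1 for t in product(alphabet, repeat=n)
               if max_nonoverlapping("".join(t), pattern) == r)


def g(n, pattern, alphabet="01"):
    """Number of length-n strings whose only appearance is the initial one."""
    return sum(1 for t in product(alphabet, repeat=n)
               if occurrences("".join(t), pattern) == [0])


def check_recurrence(pattern, r, n):
    """Return both sides of f_r(n) = sum_k f_{r-1}(n-k) g(k)."""
    lhs = f(r, n, pattern)
    rhs = sum(f(r - 1, n - k, pattern) * g(k, pattern) for k in range(n + 1))
    return lhs, rhs


print(check_recurrence("11", 1, 3))
```

For the pattern 11 with $r = 1$ and $n = 3$ both sides equal 3, corresponding to the strings 110, 011 and 111.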
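If the set $\mathcal{A}$ consists of a single pattern $A$ over an alphabet of size 2 then (2.2) yields

\[ F_0(z) = \frac{z\, AA_z}{1 + (z-2) AA_z}, \qquad G(z) = \frac{1}{1 + (z-2) AA_z}, \]

so for $r \ge 0$,

\[ F_r(z) = \frac{z\, AA_z}{\big(1 + (z-2) AA_z\big)^{r+1}}. \tag{6.1} \]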


This simple expression for $F_r(z)$ makes it possible to obtain good estimates for $f_r(n)$. Depending on the relative sizes of $k$, $r$, and $n$, different methods are necessary. The following result is most useful when $r$ and $k$ are relatively small, say $k r^2 < n^{1-\varepsilon}$ for some fixed $\varepsilon > 0$.


Theorem 6.2. If $A$ is a fixed pattern over an alphabet of size 2, $|A| = k$, and $f_r(n)$ denotes the number of strings of length $n$ from the same alphabet in which the maximum number of nonoverlapping appearances of $A$ equals $r$, then for $k r^2 < n/10$, we have

\[ f_r(n) = 2^n \binom{n}{r} (2 AA_2)^{-r} \exp\Big(-\frac{n}{2 AA_2} + O(r k 2^{-k} + n k 4^{-k} + k r^2 / n)\Big). \]


Proof. It can be shown [5] that $1 + (z-2) AA_z$ has exactly one zero $\rho$ with $|\rho| \ge 1.7$ for $|A| = k$ large enough and this zero satisfies

\[ \rho = 2 - \frac{1}{AA_2} + O(k 4^{-k}). \]

Therefore, if we write

\[ F_r(z) = \sum_{m=1}^{r+1} a_m (z - \rho)^{-m} + h(z), \]

then $h(z)$ is analytic for $|z| \ge 1.7$, and therefore

\[ f_r(n) = \sum_{m=1}^{r+1} a_m \binom{n-1}{m-1} \rho^{\,n-m} + b_r(n), \]

where

\[ b_r(n) = \frac{1}{2\pi i} \oint_{|z| = 1.7} F_r(z)\, z^{n-1}\, dz. \]

But since

\[ |1 + (z-2) AA_z| \ge \varepsilon (1.7)^k \quad \text{for } |z| = 1.7 \]

(see [5]), where $\varepsilon > 0$ is a fixed constant,
b(n) = O(e-"(1.7)"-*")=O()"—). 
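We next estimate the $a_m$. Write

\[ 1 + (z-2) AA_z = c\,(z - \rho)\{1 + (z - \rho) u(z)\}. \]

Then

\[ c = (\rho - 2) AA'_\rho + AA_\rho = AA_2 (1 + O(k 2^{-k})), \]

and

\[ u(z) = O(k) \quad \text{for } |z - \rho| \le k^{-1}, \]

say, by Taylor's formula with remainder.
With this notation,

\[ a_{r+1} = \frac{\rho\, AA_\rho}{c^{\,r+1}} = 2 (AA_2)^{-r} \exp(O(k r 2^{-k})), \]

while for $1 \le m \le r$,

\[ a_m = \frac{1}{2\pi i} \oint_{|z - \rho| = 1/(kr)} F_r(z) (z - \rho)^{m-1}\, dz = O\Big((kr)^{-m} \max_{|z - \rho| = 1/(kr)} |F_r(z)|\Big) \]
\[ \qquad = O(c^{-r-1} (kr)^{r+1-m} 2^k) = O\big((AA_2)^{-r} (kr)^{r+1-m} e^{O(k r 2^{-k})}\big). \]

If we now combine all of the above estimates, we obtain the theorem.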
The results of Theorem 6.2 (as well as similar ones that can be derived for 
B*-runs) enable one to study the frequencies of various kinds of repetitive runs. 


We will now use Theorem 6.2 to prove the results about head-runs that are stated in the last part of the Introduction. As in the Introduction, let

\[ n_k = [2^{k+1} (\log k + 2 \log\log k)]. \]

If we let $A$ be a head-run of length $k$, then by Theorem 6.2 the probability that the first $n_k$ coin tosses contain $\le 1$ appearances of $A$ is

\[ \sim \frac{1}{k \log k} \quad \text{as } k \to \infty. \]

Therefore an argument based on the Borel-Cantelli Lemma (similar to that of Section 5) shows that with probability 1 there are infinitely many $k$ for which $\nu_k(k) = 1$, which proves (1.7). (Notation here is the same as in the Introduction.) The probability that the first $n_k$ coin tosses contain $\ge (e + \delta) \log k$ disjoint appearances of $A$ is

\[ O(\delta^{-1} k^{-1} (\log k)^{-2}), \]

and since the sum of these converges, the Borel-Cantelli lemma shows that with probability 1,

\[ \limsup_{k \to \infty} \frac{\nu_k(k)}{\log k} \le e + \delta. \]

Since this holds for all $\delta > 0$, we obtain (1.8). The other estimates of the $\nu_k(k - r)$ are obtainable by analogous arguments.

Introduction


In this note, we consider the existence and some basic properties of predictable and dual predictable projections of stochastic processes in the plane. The results presented here are an adaptation of an earlier version of this note, which was written before the result of Bakry [1] that every two-parameter $L^2$ martingale possesses a cadlag version. In the earlier version, predictable projections were introduced by projecting successively, first with respect to one parameter and then with respect to the other. It was shown that the result is a predictable process, but it was not known whether these successive projections commute, and consequently the predictable projection might not have been unique. In this version, we prove the uniqueness of the projection by means of a selection lemma for predictable sets, and the results are also based on the works of Doléans and Meyer [5], and Meyer [7]. Finally, we point out that the results are directly applicable to the extension of the definition of stochastic integrals in the plane.


We wish to thank P.A. Meyer for calling our attention to some serious errors in a previous 
version of this note. 


Notation 


The notation and definitions of this note will follow those of [3]. For two points $z = (s,t)$ and $z' = (s',t')$ in the positive quadrant of the plane $\mathbb{R}^2_+$, $z \prec z'$ means $s \le s'$ and $t \le t'$, and $z \prec\prec z'$ means $s < s'$ and $t < t'$.

If $z \prec z'$, $(z,z']$ will denote the rectangle $(s,s'] \times (t,t']$. Let $(\Omega, \mathcal{F}, P)$ be a complete probability space with a right-continuous filtration $\{\mathcal{F}_z, z \in \mathbb{R}^2_+\}$ satisfying the following property (the (F4) property of [3]): Let $\mathcal{F}^1_{s,t} = \mathcal{F}_{s,\infty}$ and $\mathcal{F}^2_{s,t} = \mathcal{F}_{\infty,t}$; then for each $z$, $\mathcal{F}^1_z$ and $\mathcal{F}^2_z$ are conditionally independent given $\mathcal{F}_z$. A set in the product space $\mathbb{R}^2_+ \times \Omega$ is called measurable if it belongs to the $\sigma$-algebra $\mathcal{B}(\mathbb{R}^2_+) \otimes \mathcal{F}$.

Consider the measurable space $(\mathbb{R}^2_+ \times \Omega, \mathcal{B}(\mathbb{R}^2_+) \otimes \mathcal{F})$. Observe that finite unions of the following "rectangles" $((z,z'] \times G)$, $G \in \mathcal{F}_z$ (respectively $G \in \mathcal{F}^i_z$, $i = 1,2$), constitute an algebra. A set is called predictable ($i$-predictable, $i = 1,2$) if it belongs to the $\sigma$-algebra generated by these "rectangles". This $\sigma$-algebra will be denoted by $\mathcal{P}$ (respectively $\mathcal{P}^i$, $i = 1,2$). A process will be called predictable (respectively $i$-predictable) if it is $\mathcal{P}$ (respectively $\mathcal{P}^i$) measurable. Finally, recall the definition of an increasing process [3]: A process $A = \{A_z, z \in \mathbb{R}^2_+\}$ is an increasing process if $A$ vanishes on the axes, is right continuous, adapted, $\sup_z E A_z < \infty$ and for every rectangle $(z,z']$, $z \prec z'$, satisfies

\[ A(z,z'] = A_{z'} + A_z - A_{s,t'} - A_{s',t} \ge 0. \]
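
As a small illustration of the rectangle increment in the definition of an increasing process, the sketch below evaluates $A(z,z']$ for a deterministic two-parameter function. The function name `rect_increment` and the choice $A_{s,t} = st$ are assumptions made for this example only; for this choice the increment is the area of the rectangle, hence nonnegative.

```python
def rect_increment(A, z, z_prime):
    """Increment of A over the rectangle (z, z'] with z = (s, t), z' = (s', t').

    Implements A(z, z'] = A_{z'} + A_z - A_{s,t'} - A_{s',t}.
    """
    (s, t), (s2, t2) = z, z_prime
    return A(s2, t2) + A(s, t) - A(s, t2) - A(s2, t)


def area(s, t):
    """Deterministic example A_{s,t} = s * t; it vanishes on both axes."""
    return s * t


# Increment over (1, 3] x (2, 5]: equals (3 - 1) * (5 - 2).
print(rect_increment(area, (1, 2), (3, 5)))
```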


Predictable and Dual Predictable Projections 


Let $X_\zeta = \alpha I_{(z,z']}(\zeta)$, $\zeta \in \mathbb{R}^2_+$, where $\alpha$ is a bounded random variable, and let $\alpha_\zeta$ denote the left continuous version of the (essentially one-parameter) martingale $E(\alpha \mid \mathcal{F}^1_\zeta)$. Then $\Pi^1 X_\zeta = \alpha_\zeta I_{(z,z']}(\zeta)$, $\zeta \in \mathbb{R}^2_+$, is obviously in $\mathcal{P}^1$. The process $\{\Pi^1 X_\zeta, \zeta \in \mathbb{R}^2_+\}$ is called the 1-predictable projection of $X$, for the particular case where $X_\zeta = \alpha I_{(z,z']}(\zeta)$. It was shown by Meyer [7] that the notion of $i$-predictable projections ($i = 1,2$) can be extended to all bounded and measurable processes as follows: Let $X$ be a bounded process; then there exists a unique bounded $i$-predictable process $\Pi^i X$ ($i = 1$ or $2$), such that for all integrable $i$-predictable increasing processes $A$ the equality $E \int X\, dA = E \int (\Pi^i X)\, dA$ holds.


Remark. The extension of $\Pi^i X$ to all bounded and measurable processes can also be done by the same arguments as in the one-parameter case by applying an $i$-section theorem for two-parameter processes which was derived in [6].


Lemma 1. (a) Let $X$ be a simple 1-predictable random process (i.e. $X_\zeta = \alpha I_{(z,z']}(\zeta)$, $\zeta \in \mathbb{R}^2_+$, where $\alpha$ is an $\mathcal{F}^1_z$ measurable and bounded random variable); then $\Pi^2 X$, the 2-predictable projection of $X$, is a predictable process. (b) If $\{Y_z, z \in \mathbb{R}^2_+\}$ is bounded (or positive) and 1-predictable, then its 2-predictable projection $\Pi^2 Y$ is predictable (and therefore if $Y$ is both 1- and 2-predictable then $Y$ is predictable).


Proof. $\Pi^2 X_\zeta = E(\alpha \mid \mathcal{F}^2_{\zeta-})\, I_{(z,z']}(\zeta)$. Since $\alpha$ is $\mathcal{F}^1_z$ measurable, it follows by the (F4) property that $\Pi^2 X_\zeta$ is $\mathcal{F}_\zeta$ adapted and left continuous. The fact that adapted left continuous processes are predictable follows by the same argument as in the one-parameter case ([4], p. 78). This proves (a). Part (b) follows from part (a) by the monotone class theorem.


Corollary. The $\sigma$-fields $\mathcal{P}^1$, $\mathcal{P}^2$, $\mathcal{P}$ satisfy: $\mathcal{P} = \mathcal{P}^1 \cap \mathcal{P}^2$.


Proof. If $Y$ is both 1- and 2-predictable, then by Lemma 1, $Y$ is predictable and therefore $\mathcal{P} \supset (\mathcal{P}^1 \cap \mathcal{P}^2)$. The inverse inclusion is obvious, therefore $\mathcal{P} = \mathcal{P}^1 \cap \mathcal{P}^2$.

The following lemma summarizes two simple results related to $\Pi^i$ which are needed later. $X$ will denote a bounded or positive measurable process in $\mathbb{R}^2_+$ and $A$ will denote an increasing process.

Lemma 2. (a) $\Pi^i X_z = E(X_z \mid \mathcal{F}^i_{z-})$ a.s.
(b) If for all $X$ we have $E \int X\, dA = E \int \Pi^1 X\, dA$, then $A$ is 1-predictable.


Proof. The proof of (a) is the same as in the one-parameter case (cf. [4], VT15), and is, therefore, omitted. Turning to (b): Let $X_{s,t} = 0$ for $t > t_0$ and $X_{s,t_1} = X_{s,t_2}$ for all $t_1, t_2 \le t_0$. Then, by the corresponding one-parameter result (cf. [4], VT26), it follows that $(A_{s,t_0})_s$ is 1-predictable. Therefore, $(A_{s,t})_s$ is 1-predictable for, say, all rational $t$. Therefore, since $A$ is right continuous, $A$ is 1-predictable.

A real $\sigma$-additive measure $\mu$ on $(\mathbb{R}^2_+ \times \Omega, \mathcal{B}(\mathbb{R}^2_+) \otimes \mathcal{F})$ such that $P$-evanescent subsets of $\mathbb{R}^2_+ \times \Omega$ are $\mu$-null is called a stochastic measure. If $A$ is an increasing process (bounded variation process), then $A$ induces a (signed) stochastic measure $\mu_A$ defined on all the measurable sets in the following way: If $X$ is a positive measurable process, then $\mu_A(X) = E[\int X_z\, dA_z]$. Conversely, if $\mu$ is a stochastic measure defined on all the measurable sets, then there exists a unique increasing but not necessarily adapted process $A$ such that $\mu = \mu_A$. The proof is exactly the same as that in the one-parameter case [4] and is therefore omitted.

A stochastic measure is said to be predictable if the corresponding increasing process is predictable.

Let $A = \{A_z, z \in \mathbb{R}^2_+\}$ be an increasing process. By the results of [5], there exists a unique increasing process $A^{\pi^i}$, $i = 1,2$, such that

\[ E \int \Pi^i X\, dA = E \int X\, dA^{\pi^i}; \qquad i = 1,2 \tag{1} \]

for all bounded and measurable processes $X$. The process $A^{\pi^i}$ is $i$-predictable and is called the dual $i$-predictable projection of $A$; $A - A^{\pi^i}$ is an $i$-martingale.

In fact, the existence and the uniqueness of $A^{\pi^i}$ follow directly, as in the one-parameter case, by looking at the stochastic measure $\mu$ defined by:

\[ \mu(X) = E\Big[\int \Pi^i X\, dA\Big]. \]


Definition. Let $A = \{A_z, z \in \mathbb{R}^2_+\}$ be an increasing process. The process $A^{\pi} = (A^{\pi^1})^{\pi^2}$ is called the dual predictable projection of $A$.


Proposition 1. $A^{\pi}$ is a predictable process.


Proof. Following Proposition 5 of [5], if each bounded martingale has left limits then $A^{\pi}$ is 1-predictable and 2-predictable. In [1], Bakry proved that this is the case. Then we can conclude by the above Corollary.


Lemma 3. Let $L$ be a predictable non-evanescent set. Then there exists a predictable stochastic measure $\mu$ supported by $L$ such that:

\[ \mu(\mathbb{R}^2_+ \times \Omega) = \mu(L) > 0. \]


Remark. For the one-parameter case, this result is a trivial consequence of the section theorem for predictable sets. It is useful for establishing the uniqueness of predictable processes possessing certain properties. We will refer to Lemma 3 as the "selection lemma".


Proof. The proof follows along the same lines as the proof of Proposition 2 of [7]: Following the classical section theorem, there exists a measurable selection $\Gamma$ of $L$ into $\mathbb{R}^2_+ \cup \{\infty\}$ such that $(\Gamma(\omega), \omega) \in L$ for almost every $\omega$ in the projection of $L$ on $\Omega$ (and otherwise $\Gamma(\omega) = +\infty$). Define the increasing, not necessarily adapted, process $A_z(\omega) = I_{\{\Gamma(\omega) \prec z\}}$. Then:

\[ E \int I_L\, dA = \mathrm{Prob}\{\text{Projection of } L \text{ on } \Omega\} \ne 0. \]

Let $D = \int I_L\, dA^{\pi}$ and let $\mu$ be the stochastic measure induced by $D$. Following Proposition 1, $\mu$ is predictable and the statement of the Lemma follows. The following corollary follows immediately:


Corollary. If $L$ is a predictable set and $E \int I_L\, dA = 0$ for all predictable increasing processes $A$, then $L$ is an evanescent set.


Proposition 2. Let $X = \{X_z, z \in \mathbb{R}^2_+\}$ be a bounded and measurable process. Then there exists a unique predictable process $\Pi X$ such that:

\[ E \int X_z\, dA_z = E \int \Pi X_z\, dA_z, \]

for all predictable increasing processes $A$. The process $\Pi X$ will be called the predictable projection of $X$.


Proof. Let $X_\zeta = \alpha I_{(z,z']}(\zeta)$, $\zeta \in \mathbb{R}^2_+$, where $\alpha$ is a bounded random variable, and let $\alpha_\zeta$ denote the left-continuous version of the bounded martingale $E(\alpha \mid \mathcal{F}_\zeta)$ [1]. The process $\Pi X_\zeta = \alpha_\zeta I_{(z,z']}(\zeta)$ is predictable and is called the predictable projection of the process $X$. Now, if $X_1$ and $X_2$ are two simple processes such that $X_1 \le X_2$, then $\Pi X_1 \le \Pi X_2$ and we can use a monotone class argument to extend the definition of predictable projection to all bounded and measurable processes. Uniqueness: Suppose that $Y$ is a predictable process such that $E \int Y\, dA = E \int \Pi X\, dA$ for all predictable increasing processes $A$. Set $L_1 = \{\Pi X > Y\}$ and $L_2 = \{\Pi X < Y\}$. These sets are predictable and, by the selection lemma, they are evanescent.


Proposition 3. For any bounded and measurable process $X$, we have $\Pi X = \Pi^2 \Pi^1 X = \Pi^1 \Pi^2 X$.


Proof. Let $A$ be a predictable increasing process. Then

\[ E \int X\, dA = E \int \Pi^1 X\, dA = E \int \Pi^2 \Pi^1 X\, dA \]

and

\[ E \int X\, dA = E \int \Pi^2 X\, dA = E \int \Pi^1 \Pi^2 X\, dA. \]

By Lemma 1 the processes $\Pi^2 \Pi^1 X$ and $\Pi^1 \Pi^2 X$ are predictable. It follows therefore from Proposition 2 that $\Pi X = \Pi^2 \Pi^1 X = \Pi^1 \Pi^2 X$.


Proposition 4. Let $A$ be an increasing process and $\mu$ be the stochastic measure induced by $A$. Then $A$ is predictable iff $\mu$ and $\Pi$ commute; that is: if $X$ and $Y$ are measurable processes such that $\Pi X = \Pi Y$, then $\mu(X) = \mu(Y)$.


Proof. If $A$ is predictable then $\mu(X) = \mu(Y)$ by Proposition 2. Conversely, consider all pairs $(X, Y)$ such that $Y = \Pi^1 X$; then $\Pi Y = \Pi X$. Therefore $\mu(X) = \mu(Y)$ implies, by part (b) of Lemma 2, that $A$ is 1-predictable. Similarly, considering all pairs $(X, Y)$ such that $Y = \Pi^2 X$, we have $\Pi^1 \Pi^2 X = \Pi^1 \Pi^2 Y$. Therefore $\mu(X) = \mu(Y)$ and part (b) of Lemma 2 imply that $A$ is 2-predictable. $A$, being 1- and 2-predictable, is predictable by the corollary of Lemma 1.


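Proposition 5. Let $X = \{X_z, z \in \mathbb{R}^2_+\}$ be a bounded and measurable process and $\mathcal{F}_{z-} = \bigvee_{\zeta \prec\prec z} \mathcal{F}_\zeta$; then

\[ (\Pi X)_z = E[X_z \mid \mathcal{F}_{z-}] \quad \text{a.s.} \]

and

(a) if $X$ is adapted then for any two points $z \prec z'$,
\[ E[\Pi X(z,z'] \mid \mathcal{F}^i_{z-}] = E[X(z,z'] \mid \mathcal{F}^i_{z-}], \qquad i = 1,2. \]

(b) If $M_z$ is an $\mathcal{F}_z$ martingale, then $\Pi M_z$ is an $\mathcal{F}_{z-}$ martingale.
(c) If $M_z$ is an $\mathcal{F}_z$ weak martingale, then $\Pi M_z$ is an $\mathcal{F}_{z-}$ weak martingale.
(d) If $M_z$ is an adapted $i$-martingale, then $\Pi M_z$ is an adapted $\mathcal{F}^i_{z-}$ $i$-martingale.


Proof. Let $z$ be a fixed point in $\mathbb{R}^2_+$ and $H \in \mathcal{F}_{z-}$. Let

\[ A_\zeta(\omega) = \begin{cases} 1 & z \prec \zeta \text{ and } \omega \in H \\ 0 & \text{otherwise.} \end{cases} \]

This is clearly a predictable increasing process, therefore

\[ E\Big[\int X\, dA\Big] = E[X_z I_H], \]

and

\[ E\Big[\int \Pi X\, dA\Big] = E[\Pi X_z I_H]. \]

Hence, by Proposition 2, $E[X_z I_H] = E[\Pi X_z I_H]$. Now, $(\Pi^1 X)_z$ is $\mathcal{F}^1_{z-}$ measurable, and $(\Pi^2(\Pi^1 X))_z$ is $\mathcal{F}^2_{z-}$ measurable (Lemma 2 (a)). By the (F4) property and the convergence properties of conditional expectations, the $\sigma$-algebras $\mathcal{F}^1_{z-}$ and $\mathcal{F}^2_{z-}$ are also conditionally independent, given $\mathcal{F}_{z-}$. Therefore $\Pi X_z$ is $\mathcal{F}_{z-}$-measurable, and a.s. $\Pi X_z = E[X_z \mid \mathcal{F}_{z-}]$. (a), (b), (c) and (d) follow directly from this equality and similar arguments.


Remark. If $Z$ is a predictable stopping point (that is, $Z$ is a random point such that $[Z, \infty)$ is a predictable set), then $\Pi X_Z = E[X_Z \mid \mathcal{F}_{Z-}]$.
We turn now to a characterization of the dual projection.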


Proposition 6. Let $A$ be an increasing process. (a) $A^{\pi}$ is the unique increasing process satisfying

\[ E \int \Pi X\, dA = E \int X\, dA^{\pi} \tag{2} \]

for all bounded and measurable processes $X$. The process $A^{\pi}$ is predictable, and satisfies

\[ E \int \Pi^2 X\, dA^{\pi} = E \int \Pi^1 X\, dA^{\pi} = E \int \Pi X\, dA^{\pi} = E \int X\, dA^{\pi}, \tag{3} \]

for all bounded measurable processes $X$.
(b) $A^{\pi} = (A^{\pi^1})^{\pi^2} = (A^{\pi^2})^{\pi^1}$, and $A - A^{\pi}$ is a weak martingale.

Proof. (a) Following Proposition 2, $A^{\pi}$ satisfies (1). The uniqueness of $A^{\pi}$ follows by the same arguments as in the one-parameter case ([4], VT41): defining the stochastic measure $\mu(X) = E[\int \Pi X\, dA]$ and constructing from $\mu(X)$ the unique increasing process $A^{\pi}$ induced by $\mu$ on the dyadic rationals, then extending to all $z$. The final part of (a) follows from (2) by writing (2) for the case where $X = \Pi^i Y$.
(b) The equalities follow from (1). In order to show that $A - A^{\pi}$ is a weak martingale, let $B_z = A_z - A^{\pi}_z$; therefore if $X$ is predictable ($X = \Pi X$) then $E \int X\, dB = 0$. In particular, for $z \prec z'$, $E\, \alpha B(z,z'] = 0$ for all bounded $\mathcal{F}_z$-measurable random variables $\alpha$. Therefore $B = A - A^{\pi}$ is a weak martingale.


Corollaries. Let $X$ be a positive bounded and measurable process; then:
(a) If $Y$ is a predictable process then $\Pi(X \cdot Y) = Y \cdot \Pi X$.
(b) If $X$ is predictable then
\[ \Big(\int X_z\, dA_z\Big)^{\pi} = \int X_z\, dA^{\pi}_z. \]
(c) If $A$ is a predictable and increasing process then
\[ \Big(\int X_z\, dA_z\Big)^{\pi} = \int \Pi X_z\, dA_z. \]

(d) If $A$ is an increasing process and $C$ is a predictable increasing process such that $A - C$ is a weak martingale then $C = A^{\pi}$.


The proofs follow directly from the properties of $\Pi$, $\Pi^i$ and $(\ )^{\pi}$ and the decompositions $\Pi = \Pi^2 \Pi^1$ and $A^{\pi} = (A^{\pi^1})^{\pi^2}$ and are therefore omitted.

The following results follow immediately from Propositions 4 and 5 of [7] and will be needed in the next section:


Proposition 7 [7]. For every number $q \ge 1$ there exist constants $C_q$ such that:
(a) For every increasing function $A$

\[ E(A^{\pi}_\infty)^q \le C_q\, E(A_\infty)^q. \]

(b) For every process $X$ we have

\[ E(\sup_z |\Pi X_z|)^q \le C_q \cdot E(\sup_z |X_z|)^q. \]


The Extension of Stochastic Integrals 


Let $M = \{M_z, z \in \mathbb{R}^2_+\}$ be a square integrable martingale and $A$ an increasing process such that $M^2 - A$ is a weak martingale [3]. Let $\langle M \rangle = A^{\pi}$; then $\langle M \rangle$ is also an increasing process and $M^2 - \langle M \rangle$ is a weak martingale. The stochastic integral $\int \phi\, dM$ was constructed in [3] under the assumptions that $\phi$ is predictable and $E \int \phi^2\, d\langle M \rangle < \infty$.

For the case where $M$ is the Brownian sheet $W$, the stochastic integral $\int \phi\, dW$ was extended in [8] to the case where $\int \phi^2\, dz < \infty$ a.s. In view of the results of this note, the extension of [8] for the Brownian sheet goes over directly to the general case of constructing $\int \phi\, dM$ under the assumption that $\int \phi^2\, d\langle M \rangle < \infty$ a.s.; the details are therefore omitted.

Turning now to stochastic integrals of the second type, recall [3] that the 
predictable sigma field on IR’, x IR? x Q is generated by sets of the form (z,,z/,] 
x (Z,,2,] x H, where (z,,z,] and (z,,z,] are rectangles such that for every ¢ in 
the first rectangle and ¢’ in the second, we have (A ¢’ and H is a set in Fas. 
=F,,,,. Let M be a strong martingale integrable in fourth power. Let aval 
([M]7) be the increasing 1—(2—) predictable process associated with M such 
that M*?—[M]' (M?—[M]?) is a 1—(2—) martingale [3]. 

The stochastic integral {j/yydMdM was constructed under the assumption 
that (¢,¢’) is a predictable process which vanished whenever (A (' is not 
satisfied and such that EH<oo where H= (ff w°(¢,¢’)d[M]?d[M]}, <oo. 

Ri x RY 


Let 
H,= \f w7(60)d[M]2d[M}}, 


R.xR-z 


Then it follows directly that $H$ is both 1- and 2-predictable and therefore predictable. Consequently the method of [8] combined with the results of this note provides a direct extension of the stochastic integral $\iint \psi\, dM\, dM$ to the case where $H < \infty$ a.s. A different approach to the extension of the definition of stochastic integrals was given recently by Cairoli [2].

Disintegrating Measures on Compact Group Extensions


Russell A. Johnson 
University of Southern California, University Park, Los Angeles, Cal. 90007, USA 


Introduction 


Let $Y$ and $Z$ be compact Hausdorff spaces, with (positive Radon) measures $\nu$ and $\lambda$. Let $X = Y \times Z$, $\mu = \nu \otimes \lambda$, and let $f \in L^1(X, \mu)$. Fubini's theorem states that the map $r: y \mapsto \int_Z f(y,z)\, d\lambda(z)$ is defined $\nu$-a.e., is $\nu$-integrable, and $\int_X f(x)\, d\mu(x) = \int_Y r(y)\, d\nu(y)$.


It is of interest to attempt to find analogous formulas when $X$ is not a product. Thus let $X$, $Y$ be compact Hausdorff with measures $\mu$ and $\nu$. Let $\pi: X \to Y$ be continuous and onto, and suppose $\pi(\mu) = \nu$. One desires a map $\lambda: y \mapsto \lambda_y$ of $Y$ into the set $M_+(X)$ of positive Radon measures on $X$ such that

(*) Support $(\lambda_y) \subset \pi^{-1}(y)$;
(**) if $f \in L^1(X, \mu)$, then $r: y \mapsto \int f\, d\lambda_y$ is defined $\nu$-a.e., and $\int_X f\, d\mu = \int_Y r(y)\, d\nu(y)$.
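
On a finite space both conditions can be realized by elementary conditioning, which may help fix ideas. The following is a minimal sketch under that assumption (finite $X$, measures given as dictionaries of exact rational masses); the names `disintegrate` and `integrate` are hypothetical and not part of the paper.

```python
from collections import defaultdict
from fractions import Fraction


def disintegrate(mu, pi):
    """Finite disintegration of `mu` (dict x -> mass) along the map `pi`.

    Returns (nu, lam): nu is the image measure pi(mu) on Y, and lam[y] is
    the probability measure obtained by normalizing mu on the fiber over y.
    """
    nu = defaultdict(Fraction)
    for x, mass in mu.items():
        nu[pi(x)] += mass
    lam = {y: {x: mass / nu[y] for x, mass in mu.items() if pi(x) == y}
           for y in nu if nu[y] > 0}
    return dict(nu), lam


def integrate(measure, f):
    """Integral of f against a finite measure given as dict point -> mass."""
    return sum(mass * f(x) for x, mass in measure.items())


# Example: X = {0, ..., 5}, Y = {0, 1}, pi(x) = x mod 2.
mu = {x: Fraction(x + 1, 21) for x in range(6)}
pi = lambda x: x % 2
nu, lam = disintegrate(mu, pi)
f = lambda x: x * x
lhs = integrate(mu, f)
rhs = sum(nu[y] * integrate(lam[y], f) for y in nu)
print(lhs == rhs)
```

Here each `lam[y]` is supported on the fiber over `y`, as in (*), and the last line checks the integral formula in (**) for one test function.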


A map $\lambda$ satisfying somewhat stronger conditions (see 0.8 and 0.10) is called a strict, $\nu$-adequate disintegration of $\mu$ with respect to $\pi$. In [9], it is shown that every extension $(X, \mu)$ of $(Y, \nu)$ has a strict, $\nu$-adequate disintegration with respect to $\pi$ if and only if $M^\infty(Y, \nu)$ ([9], p. 15) admits a strong lifting ([9], p. 104). One may inquire what can be proved if $M^\infty(Y, \nu)$ is not assumed to have a strong lifting. This is the problem we consider.

In §1, we consider two easy propositions. Consider a fixed $\mu$, and any map $\lambda: Y \to M_+(X)$ such that $y \mapsto \lambda_y(f)$ is $\nu$-integrable for each $f \in C(X)$, and $\int_X f(x)\, d\mu(x) = \int_Y \lambda_y(f)\, d\nu(y)$. Such a $\lambda$ always exists ([9], Chpt. VII). We note that, if $f \in L^1(X, \mu)$ is Baire measurable, then $y \mapsto \lambda_y(f)$ is $\nu$-integrable, and $\mu(f) = \int_Y \lambda_y(f)\, d\nu(y)$. Since every $g \in L^1(X, \mu)$ is equal a.e. to a Baire measurable $f \in L^1(X, \mu)$, this shows that $\int_X g(x)\, d\mu(x)$ may be obtained from $\lambda$ and $\nu$ by altering $g$ on a set of measure zero. We also show that, if $\mu$ is completion regular (0.4), then $\lambda$ does satisfy (**). In fact, $\lambda$ is $\nu$-adequate (0.8).

In §2, we consider the special case when a compact group $G$ acts freely (0.11) on $X$ in such a way that $X/G = Y$. (The case when $G$ is metric was treated in [10].) We show there is a map $\lambda: Y \to M_+(X)$ such that: $\|\lambda_y\| = 1$ for all $y \in Y$; (*) is satisfied; and, if $f \in L^1(X, \mu)$ is Baire measurable, then $y \mapsto \lambda_y(f)$ is also Baire measurable, with $\int_X f(x)\, d\mu(x) = \int_Y \lambda_y(f)\, d\nu(y)$. We call $\lambda$ a Baire disintegration of $\mu$ with respect to $\pi$ (2.2). If $\mu$ is completion regular, then $\lambda$ is in addition a strict disintegration. In the process of obtaining $\lambda$, we refine results in §1 of [10].

The statements proved in §2 resemble theorems on the existence of Baire disintegrations proved by Maharam ([12]) and Edgar ([3]). These authors assume (in our notation) that $X$ is a product $G \times Y$. They do not assume $G$ is a group. (Edgar weakens the assumption that $X$ be compact.) The fact that $G$ is a group gives $X$ enough structure to allow our results to be proved. Our proofs are independent of [3] and [11]. They also use no lifting theory. In fact, techniques using lifting theory ([3], [6], [16]) do not seem to be readily applicable to our situation, since they do not (necessarily) yield strict disintegrations.

In §3, we consider invariance of the disintegrations of §2 under the action of a group $T$. We consider the case when $(G, X, T)$ is a bitransformation group, $T$ is locally compact separable, and $\mu$ is $T$-ergodic (0.12; this may be replaced by "$T$-invariant"). Using an idea of Varadarajan ([15]) and results of [10], we show that $\mu$ has a Baire disintegration $\lambda$ with respect to $\pi$ which is strictly $T$-invariant: $\int_X f(x \cdot t)\, d\lambda_y(x) = \int_X f(x)\, d\lambda_{y \cdot t}(x)$ ($y \in Y$, $t \in T$, $f \in C(X)$). If $\mu$ is completion regular, then $\lambda$ is strict.

This paper developed from an attempt to extend various propositions in [10] 
to the case when G is not metric. The paper [10], in turn, was motivated by the 
study of invariant measures on distal flows and distal extensions ([5]). The 


results presented here for group extensions apply also to general distal exten- 
sions. 


§ 0. Preliminaries 


Let X be compact Hausdorff. 


0.1 Definitions. Let $M_+(X)$ be the set of positive Radon measures on $X$. Let $M^1_+(X) = \{\mu \in M_+(X) \mid \|\mu\| = 1\}$. We give these spaces the topology of pointwise convergence on $C(X)$ (the vague topology). We often write $\mu(f)$ or $\langle \mu, f \rangle$ for $\int_X f(x)\, d\mu(x)$.


0.2 Definition. If $\mu \in M_+(X)$, let $\mu^*$ be its upper integral. Thus if $g \ge 0$ is lower semicontinuous (l.s.c.), then $\mu^*(g) = \sup\{\mu^*(f) \mid f \in C(X), f \le g\}$; if $h \ge 0$ is arbitrary, then $\mu^*(h) = \inf\{\mu^*(g) \mid g \text{ is l.s.c.}, g \ge h\}$. We also write $\mu^*(h) = \int^* h(x)\, d\mu(x)$. See [1].


0.3 Definition. Call the smallest $\sigma$-algebra of subsets of $X$ which contains all compact $G_\delta$'s the class of Baire subsets of $X$.

0.4 Definition. Let $\mu \in M_+(X)$, and let $\mu_0$ be the restriction of $\mu$ to the Baire sets of $X$. Say that $\mu$ is completion regular if the completion ([7], p. 55) of $\mu_0$ is defined on all Borel subsets of $X$. Equivalently, $\mu$ is completion regular iff for every Borel $E \subset X$, there are Baire sets $A$, $B$ such that $A \subset E \subset B$ and $\mu(B \sim A) = 0$. See ([7], p. 230).


0.5 Definitions. Let $g$ map $X$ to a compact Hausdorff space $Z$. Say that $g$ is Baire measurable if $g^{-1}(B)$ is Baire in $X$ for each Baire subset $B$ of $Z$. If $g$ is continuous it is Baire measurable. Let $Z$ be a topological space, and let $g: X \to Z$. Make the usual definition of Borel measurability of $g$. If $\mu \in M_+(X)$, say $g$ is $\mu$-Lusin-measurable if, given $\varepsilon > 0$, there is a compact subset $K_\varepsilon$ of $X$ such that $\mu(X \sim K_\varepsilon) < \varepsilon$, and $g|K_\varepsilon$ is continuous.


0.6 Proposition. Let $g$ map $X$ to a compact metric space $Z$, and suppose $g$ is Borel measurable. Let $\mu \in M_+(X)$. There is a Baire measurable map $\tilde{g}: X \to Z$ such that $\tilde{g}(x) = g(x)$ $\mu$-a.e.


Proof. The proposition is true if $Z = [0, 1]$. There is a homeomorphism $i$ from $Z$ onto a subset $Z_1$ of the Hilbert cube $\prod_{j=1}^{\infty} [0, 1]$. Write $i \circ g(x) = (g_j(x))_{j=1}^{\infty}$. Let $\tilde{g}_j = g_j$ $\mu$-a.e. with $\tilde{g}_j$ Baire, and let $\tilde{g}(x) = i^{-1}[(\tilde{g}_j(x))_{j=1}^{\infty}]$ where defined.


0.7 Notation. Let $Y$ be another compact Hausdorff space, $\nu \in M_+(Y)$. Let $\lambda: Y \to M_+(X)$ be such that, if $f \in C(X)$, then $y \mapsto \langle \lambda_y, f \rangle$ is $\nu$-integrable. Then the formula $\mu(f) = \int_Y \langle \lambda_y, f \rangle\, d\nu(y)$ defines an element $\mu$ of $M_+(X)$; we write $\mu = \int_Y \lambda_y\, d\nu(y)$. See [2].


0.8 Definitions. Let $\lambda, \mu, \nu$ be as above. Say $\lambda$ is $\nu$-pre-adequate if, for each l.s.c. function $f \ge 0$ on $X$, one has that $r: y \mapsto \lambda^*_y(f)$ is $\nu$-measurable, and $\mu^*(f) = \nu^*(r)$. If $\lambda$ is $\nu'$-pre-adequate for all $\nu' \le \nu$, then $\lambda$ is $\nu$-adequate. See ([2], §3, n°1, Def. 1). If $X$ is compact, then $\lambda$ is $\nu$-adequate $\Leftrightarrow$ $\lambda$ is $\nu$-pre-adequate ([2], §3, Ex. 7). If $\lambda$ is $\nu$-Lusin measurable, then $\lambda$ is $\nu$-adequate ([2], pp. 18-19).


0.9 Proposition. If $\lambda$ is $\nu$-adequate and $\mu = \int_Y \lambda_y\, d\nu(y)$, then (**) of the introduction holds; thus $f \in L^1(X, \mu) \Rightarrow \mu(f) = \int_Y \lambda_y(f)\, d\nu(y)$.

For the proof, see ([2], §3, n°3, Prop. 5).


0.10 Definition. Let $\lambda, \mu, \nu$ be as above. In addition, suppose that $\pi: X \to Y$ is continuous, with $\pi(\mu) = \nu$. Say that $\lambda$ is a disintegration of $\mu$ with respect to $\pi$ if:
(a) $\|\lambda_y\| = 1$ for all $y \in Y$;
(b) $\mu = \int_Y \lambda_y\, d\nu(y)$.
If, in addition
(c) Support $(\lambda_y) \subset \pi^{-1}(y)$ for all $y \in Y$,
then $\lambda$ is a strict disintegration of $\mu$ with respect to $\pi$.

0.11 Definitions. Let $G$ be a compact group, $T$ a locally compact group. Suppose $(G, X)$ and $(X, T)$ are left and right transformation groups ([5]), respectively. We write $gx$ or $g \cdot x$ for the action of $g \in G$ on $x \in X$; similarly, we write $xt$ or $x \cdot t$. If $f \in C(X)$ and $g \in G$ let $(f \cdot g)(x) = f(gx)$; if $\mu \in M_+(X)$, let $(g \cdot \mu)(f) = \mu(f \cdot g)$. Similarly for $t \in T$. We say $G$ acts freely on $X$ if $g \cdot x = x \Rightarrow g = \mathrm{idy} \in G$ ($g \in G$, $x \in X$). If the actions of $G$ and $T$ commute (that is, if $(gx)t = g(xt)$ for all $g, x, t$), then we say $(G, X, T)$ is a bitransformation group.


0.12 Definitions. Let $(X, T)$ be a transformation group. Say that $\mu \in M^1_+(X)$ is $T$-invariant if $\mu \cdot t = \mu$ ($t \in T$). If, in addition, $\mu(A) = 0$ or $\mu(A) = 1$ for any set $A$ satisfying $\mu(A \,\Delta\, A t^{-1}) = 0$ ($t \in T$), then $\mu$ is $T$-ergodic. See [14].


§1 


1.1 Notation. Let $X$ and $Y$ be compact Hausdorff, let $\pi: X \to Y$ be continuous and onto, suppose $\mu \in M_+(X)$, and let $\nu = \pi(\mu)$. Let $\lambda: Y \to M_+(X)$ satisfy: i) $y \mapsto \lambda_y(f)$ is $\nu$-integrable for $f \in C(X)$; ii) $\mu(f) = \int_Y \lambda_y(f)\, d\nu(y)$ ($f \in C(X)$). I.e., $\mu = \int_Y \lambda_y\, d\nu(y)$. Such a map $\lambda$ always exists; see ([9], Chap. VII), or use the existence of conditional expectations ([13]).

1.2 Remarks. (a) Note that, if $f_c(x) = c = \text{const}$, then $y \mapsto \lambda_y(f_c)$ is $\nu$-integrable.

(b) If $\|\lambda_y\| \le K < \infty$ for $\nu$-a.a. $y$, then i) of 1.1 is satisfied.


The following statements are easily proved using standard techniques ([1], [3]).


1.3 Lemma. Let $N \subset X$ be a Baire set with $\mu(N) = 0$. Then $\lambda_y(N) = 0$ for $\nu$-a.a. $y$.


1.4 Proposition. If $f \in L^1(X, \mu)$ is Baire measurable, then $y \mapsto \lambda_y(f)$ is defined $\nu$-a.e., is $\nu$-integrable, and $\mu(f) = \int_Y \lambda_y(f)\, d\nu(y)$.


The proof of 1.5 also involves only standard arguments. Since the proposition itself does not seem standard, however, we include a proof.

1.5 Proposition. Let $\mu$ be completion regular, $\lambda$ as in 1.1. Then $\lambda$ is $\nu$-adequate.


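Proof. By ([2], §3, Exercise 7), it suffices to show that, if $f \ge 0$ is lower semicontinuous on $X$, then the map $r: y \mapsto \lambda^*_y(f)$ is $\nu$-measurable, and $\mu^*(f) = \int^*_Y r(y)\, d\nu(y)$. Since $f$ is $\mu$-measurable ([1], Chap. IV, §5, n°5, Corollary to Prop. 8), there is an increasing sequence $(K_i)_{i=1}^{\infty}$ of compact subsets of $X$ such that $f|K_i$ is continuous ($i \ge 1$), and $N = X \sim \bigcup_{i=1}^{\infty} K_i$ has $\mu$-measure zero. Let $\psi_0$ be the characteristic function of $N$, $\psi_i$ that of $K_i$.

Since $\mu$ is completion regular, one has $N \subset B$, where $B$ is a Baire set with $\mu(B) = 0$. By 1.4, $\lambda_y(\psi_0) = 0$ $\nu$-a.e., hence $\lambda^*_y(f \cdot \psi_0) = 0$ $\nu$-a.e. One also has $A_i \subset K_i \subset B_i$, where $\mu(B_i \sim A_i) = 0$ and $A_i$, $B_i$ are Baire. Let $\varphi_i$ be the characteristic function of $B_i$ ($i \ge 1$). Let $f_i$ be a continuous extension of $f|K_i$ to $X$. Then (1.3) $y \mapsto \lambda_y(f_i \cdot \varphi_i)$ is $\nu$-measurable, and $\mu(f_i \cdot \varphi_i) = \int_Y \lambda_y(f_i \cdot \varphi_i)\, d\nu(y)$ (1.3). But $B_i \sim K_i$ is contained in a Baire set of $\mu$-measure zero; hence $\lambda_y(f_i \cdot \varphi_i) = \lambda_y(f \cdot \psi_i)$ $\nu$-a.e. So $\mu(f \cdot \psi_i) = \int_Y \lambda_y(f \cdot \psi_i)\, d\nu(y)$ ($i \ge 1$).

Now:

\[ \mu^*(f) = \sup_i \mu(f \cdot \psi_i) = \sup_i \int_Y \lambda_y(f \cdot \psi_i)\, d\nu(y) = \int^*_Y \sup_i \lambda_y(f \cdot \psi_i)\, d\nu(y) = \int^*_Y \lambda^*_y(f - f \cdot \psi_0)\, d\nu(y). \]

But

\[ \lambda^*_y(f) = \lambda^*_y(f - f \cdot \psi_0 + f \cdot \psi_0) \le \lambda^*_y(f - f \cdot \psi_0) + \lambda^*_y(f \cdot \psi_0) = \lambda^*_y(f - f \cdot \psi_0) \quad \nu\text{-a.e.} \]

Hence $y \mapsto \lambda^*_y(f)$ is $\nu$-measurable, and $\mu^*(f) = \int^*_Y \lambda^*_y(f)\, d\nu(y)$. This completes the proof.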


Note that we can always replace / by a v-adequate map /’ if we define 1)(f) 
=p<A,,f>, where p is a lifting of M®(Y, v). In later applications, however, this 
will not be convenient, because strictness of A is not inherited by 2’. 

We also prove a lemma for later use. 

1.6 Lemma. Let X, Y, z be as in 1.1, and assume z is open. 
(a) If KX is a compact Baire set, so is m(K)< Y; similarly if Oc X is open 
Baire. 


(b) If we M ,(X) is completion regular, then 2(y) is also completion regular. 


Proof. (a) Write $K = \bigcap_{i=1}^\infty O_i$, where $O_i$ is open and
$\operatorname{cls} O_{i+1} \subset O_i$ ($i \ge 1$). Then
$\pi(K) = \bigcap_{i=1}^\infty \pi(O_i)$.


(b) Let $E \subset Y$ be Borel. Then $A \subset \pi^{-1}(E) \subset B$ where
$A$, $B$ are Baire and $\mu(B \sim A) = 0$. We may assume that $A$ and $\sim B$
are countable unions of compact $G_\delta$'s. Using (a), it follows easily that
$\nu$ is completion regular.


§2 


2.1 Notation, Terminology. Let $(G, X)$ be a transformation group with $G$
compact, and suppose $G$ acts freely (0.9). Let $Y$ be the quotient $X/G$; we say
$X$ is an extension (or, group extension) of $Y$. Let $\pi\colon X \to Y$ be the
canonical projection. Then $\pi$ is open. Fix $\mu \in M_+(X)$, and let
$\nu = \pi(\mu)$. If $H$ is a closed subgroup of $G$, we will often write
$\pi_H$ for the projection of $X$ onto $X/H$. Note that $C(X/H)$ can be
canonically embedded in $C(X)$ via $f \mapsto f \circ \pi_H$.


2.2 Definition. A map $\lambda\colon Y \to M_+(X)$ is a Baire disintegration of
$\mu$ with respect to $\pi$ if:

(a) $\mu = \int_Y \lambda_y\,d\nu(y)$ (0.7);

(b) $\|\lambda_y\| = 1$ for all $y$ (or, $\lambda$ maps $Y$ to $M_+^1(X)$);

(c) If $f \in C(X)$, then $y \mapsto \lambda_y(f)$ is Baire measurable.

If, in addition,

(d) Support $(\lambda_y) \subset \pi^{-1}(y)$ for all $y \in Y$,

then $\lambda$ is a strict Baire disintegration of $\mu$ with respect to $\pi$.
2.3 Remark. Let $F$ be the class of bounded functions $f$ on $X$ such that
$y \mapsto \lambda_y(f)$ is Baire measurable. By 2.2(c), $F$ contains the
continuous functions. A monotone class argument ([13], I, Theorem 20) shows that
$F$ contains all bounded Baire functions on $X$.


The following result will be used. For the proof, see ([8], p. 85). 


2.4 Proposition. Let $H$ be a closed normal subgroup of $G$,
$H \ne \{\mathrm{id}_X\}$. There exists a closed normal subgroup $K$ of $G$ such
that $K \subsetneq H$ and such that, if $L = H/K \subset G/K$, then $L$ is a Lie
group, and $(G/K)/L = G/H$.


Thus X/K is a free Lie group extension of X/H (observe that (G/K, X/K) and 
(G/H, X/H) are free transformation groups; i.e., G/K and G/H act freely). 

We will need the following proposition concerning free Lie extensions. It is a
refinement of ([10], Theorem 1.9).


2.5 Proposition. Suppose (during 2.5 only) that $G$ is a Lie group. Then there
is a strict disintegration (0.10) $\eta\colon Y \to M_+(X)$ of $\mu$ with respect
to $\pi$ such that i) $\eta$ is $\nu$-Lusin-measurable (0.5, 0.8); ii) if
$f \in C(X)$, then the map $y \mapsto \eta_y(f)$ is Baire measurable. I.e.,
$\eta$ is also a Baire disintegration.

Hence $y \mapsto \eta_y(f)$ is Baire if $f$ is bounded Baire.


Proof. We must modify the proof of Theorem 1.9 in [10]. We first outline that 
proof. 

For each $x \in X$, use ([14], Sect. tas Theorem 1) to construct a compact
neighborhood $V_x$ of $x$, satisfying $G\cdot V_x = V_x$, and a compact
$F_x \subset V_x$ such that $F_x$ intersects each fiber $\pi^{-1}\pi(\bar{x})$ in
a single point ($\bar{x} \in V_x$). Choose $x_1, \dots, x_n$ such that
$V_{x_1} \cup \dots \cup V_{x_n} = X$. Let $B_1 = V_{x_1}$,
$B_i = V_{x_i} \sim \bigcup_{j<i} V_{x_j}$ ($2 \le i \le n$). Let
$\tau\colon Y \to X$ be defined by $\{\tau(y)\} = \pi^{-1}(y) \cap F_{x_i}$,
where $i$ is determined by the condition $\pi^{-1}(y) \subset B_i$.
Then $\tau$ is a $\nu$-Lusin-measurable, Borel measurable section of $X$ over $Y$.

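The map $v\colon G \times Y \to X\colon (g, y) \mapsto g\cdot\tau(y)$ defines a
measure $\mu' = v^{-1}(\mu)$ on $G \times Y$. Letting
$\pi_1\colon G \times Y \to G$ be the projection, one uses the Dunford-Pettis
theorem ([9], p. 89) to obtain a $\nu$-Lusin-measurable map
$\tilde\omega\colon Y \to M_+^1(G)\colon y \mapsto \tilde\omega_y$ such that
$\pi_1(\mu') = \int_Y \tilde\omega_y\,d\nu(y)$. Finally, if $f \in C(X)$, let
$f_y\colon G \to \mathbb{R}\colon f_y(g) = f(g\cdot\tau(y))$, then define
$\tilde\eta_y(f) = \tilde\omega_y(f_y)$. It turns out that $\tilde\eta$ is a
$\nu$-Lusin measurable, strict disintegration of $\mu$ with respect to $\pi$.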

We wish to modify $\tilde\eta$ and obtain a map $\eta$ such that
$y \mapsto \eta_y(f)$ is Baire measurable. We make two observations.

(1) By lines 10-15 on p. 221 of [14], the sets $V_{x_i}$, $F_{x_i}$ are inverse
images under a continuous map of compact subsets of some $\mathbb{R}^n$. Hence
$V_{x_i}$, $F_{x_i}$ are Baire subsets of $X$. Hence, each $B_i$ (see definitions
above) and $W_i = \pi(B_i) \subset Y$ is a Baire set (use 1.6(a); $\pi$ is open).
Also, the section $\tau\colon Y \to X$ is Baire measurable.

(2) Since $M_+^1(G)$ is compact metric, we may find a map
$\omega\colon Y \to M_+^1(G)$ which is Baire measurable, and such that
$\omega_y = \tilde\omega_y$ $\nu$-a.e. (0.6).

Define $\eta$ by replacing $\tilde\omega$ by $\omega$ in the definition of
$\tilde\eta$. Then $\eta$ is still a strict, $\nu$-Lusin measurable
disintegration of $\mu$ with respect to $\pi$.

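Let $f \in C(X)$. We show $y \mapsto \eta_y(f)$ is Baire. If
$A \subset \mathbb{R}$ is Baire, then

$$\{y \mid \eta_y(f) \in A\} = \bigcup_{i=1}^n W_i \cap \{y \in Y \mid \omega_y(f_y) \in A\}.$$

So we show that each $W_i \cap \{y \mid \omega_y(f_y) \in A\}$ is Baire. First,
the map $\sigma_i\colon W_i \to C(G)\colon y \mapsto f_y$ admits a continuous
extension $\bar\sigma_i$ to $\pi(F_{x_i})$. This is because the section
$\tau|W_i$ admits a continuous extension to $\pi(F_{x_i})$. Let
$\bar\sigma_i(\pi(F_{x_i})) = Q_i \subset C(G)$. Then $Q_i$ is compact metric.
Define

$$h_1\colon \pi(F_{x_i}) \to M_+^1(G) \times Q_i\colon y \mapsto (\omega_y, \bar\sigma_i(y)),$$

and

$$h_2\colon M_+^1(G) \times Q_i \to \mathbb{R}\colon (\rho, h) \mapsto \rho(h).$$

Then $h_1$ is Baire measurable (0.5), and $h_2$ is continuous. Now on $W_i$,
$\omega_y(f_y) = h_2 \circ h_1(y)$. Hence,
$W_i \cap \{y \mid \omega_y(f_y) \in A\} = W_i \cap h_1^{-1}h_2^{-1}(A)$ is Baire.
This completes the proof.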


We now return to the case when G is compact, not necessarily Lie. 


2.6 Theorem. Let $(G, X)$, $\mu$, etc. be as in 2.1. There is a strict Baire
disintegration $\lambda$ of $\mu$ with respect to $\pi$.


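Proof. The proof is modelled on that of Theorem 4 in [8]. Let
$\mathcal{H} = \{(H, \lambda^H) \mid H$ is a closed normal subgroup of $G$, and
$\pi_H(\mu)$ (which is a measure on $X/H$) has a Baire disintegration
$\lambda^H$ with respect to $\nu\}$. Order $\mathcal{H}$ as follows:
$(H_1, \lambda^{H_1}) \le (H_2, \lambda^{H_2})$ iff $H_1 \supset H_2$ and
$\lambda_y^{H_2}(f) = \lambda_y^{H_1}(f)$ for all
$f \in C(X/H_1) \subset C(X/H_2)$. We claim that $\mathcal{H}$ is inductive, and
that, if $(H, \lambda^H)$ is a maximal element, then
$H = \{\mathrm{id}_X\} \subset G$.

To prove $\mathcal{H}$ is inductive, let
$J = \{(H_i, \lambda^i = \lambda^{H_i}) \mid i \in I\}$ be a totally ordered
subset. Let $H_\infty = \bigcap_i H_i$ and let $X_i = X/H_i$,
$\mu_i = \pi_{H_i}(\mu)$, $X_\infty = X/H_\infty$,
$\mu_\infty = \pi_{H_\infty}(\mu)$.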


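Suppose first $J$ contains a countable cofinal subset $I_0$; we let
$I_0 = \{1, 2, 3, \dots\}$. Then $\bigcup_{i \in I_0} C(X_i)$ is dense in
$C(X_\infty)$. If $f \in C(X_l)$ ($l \in I_0$), define
$\lambda_y^\infty(f) = \lambda_y^i(f)$ ($i \ge l$, $i \in I_0$). Since $J$ is
totally ordered, $\lambda_y^\infty(f)$ is well-defined. It is linear on
$\bigcup_{i \in I_0} C(X_i)$, and $|\lambda_y^\infty(f)| \le \|f\|$ (letting
$f = 1$, we see $\|\lambda_y^\infty\| = 1$). Hence $\lambda_y^\infty$ has a
unique continuous extension (again called $\lambda_y^\infty$) to all of
$C(X_\infty)$. Now, 2.2(b) holds for $\lambda_y^\infty$. That 2.2(d) holds
follows from the corresponding fact for $\lambda^i$ ($i \in I$). If
$f \in C(X_\infty)$, write $f = \lim f_i$, where $f_i \in C(X_i)$
($i \in I_0$) and the limit is uniform. Then 2.2(a), (c) are easily checked for
$\lambda^\infty\colon y \mapsto \lambda_y^\infty$.

Suppose $J$ contains no countable cofinal set.

Then $C(X_\infty) = \bigcup_i C(X_i)$. If $f \in C(X_\infty)$, then
$f \in C(X_i)$ for some $i$. Define $\lambda_y^\infty(f) = \lambda_y^i(f)$. It
is easily seen that $\lambda^\infty$ is well-defined, and that
$\lambda^\infty\colon y \mapsto \lambda_y^\infty$ satisfies 2.2(a)-(d).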

Since $\mathcal{H}$ is inductive, it has a maximal element
$(H_\infty, \lambda^\infty)$. Suppose $H_\infty \ne \{\mathrm{id}_X\}$. By 2.4,
there is a closed normal subgroup $K$ of $G$ such that $K \subsetneq H_\infty$,
and $H_\infty/K = L$ is a Lie group. Then $(X/K)/L = X_\infty$.

By 2.3 and 2.4, there is a strict disintegration
$\eta\colon X_\infty \to M_+^1(X/K)\colon z \mapsto \eta_z$ such that, if
$f \in C(X/K)$ and $r(z) = \eta_z(f)$, then $r$ is a bounded Baire function
(bounded because $\|\eta_z\| = 1$ for all $z \in X_\infty$). Moreover,
$\int_{X_\infty} r(z)\,d\mu_\infty = \mu_K(f)$. Define
$\lambda_y^K(f) = \lambda_y^\infty(r)$ ($y \in Y$). By ([13], I, Theorem 20),
$y \mapsto \lambda_y^K(f)$ is Baire measurable. Also,

$$\mu_K(f) = \mu_\infty(r) = \text{(by 1.3)} \int_Y \lambda_y^\infty(r)\,d\nu(y) = \int_Y \lambda_y^K(f)\,d\nu(y).$$

Hence $\lambda^K\colon y \mapsto \lambda_y^K$ satisfies 2.2(a), (c). It also
satisfies 2.2(b), (d). Hence $(K, \lambda^K)$ is a strict majorant of
$(H_\infty, \lambda^\infty)$. We conclude that $H_\infty = \{\mathrm{id}_X\}$.


2.7 Proposition. If $\mu$ is completion regular, then $\mu$ admits a strict
Baire disintegration with respect to $\pi$ which is also $\nu$-adequate.


Proof. Combine 2.6 and 1.5. 


§3 


3.1 Notation. We suppose that $(G, X, T)$ is a bitransformation group with $T$
locally compact separable. Other notation will be as in §2. We suppose $\mu$ is
$T$-ergodic (0.12).

As indicated in the introduction, we will look for disintegrations $\lambda$
which are strictly $T$-invariant. Thus $\lambda$ must satisfy
$\lambda_{y\cdot t}(f) = \lambda_y(tf)$ ($t \in T$, $f \in C(X)$). We first
prove the existence of strictly $T$-invariant disintegrations for Lie group
extensions.


3.2 More Notation. From now through 3.10, we suppose $G$ is a Lie group. Let
$G_0 = \{g \in G \mid \mu(f\cdot g) = \mu(f)$ for all $f \in C(X)\}$. Then $G_0$
is a closed subgroup of $G$ (the fixing subgroup of $\mu$). Let $\gamma$ be
normalized Haar measure on $G$, and let $\gamma_0$ be normalized Haar measure on
$G_0$.

We must refine Prop. 5.4 and Lemma 5.7 of [10].

Let $\eta$ be the disintegration of $\mu$ with respect to $\pi$ constructed in
2.5. Recall that, in the course of that proof, we defined a
$\nu$-Lusin-measurable, Baire measurable section $\tau\colon Y \to X$. We also
defined a $\nu$-Lusin-measurable, Baire measurable map
$\omega\colon Y \to M_+^1(G)$ such that
$\langle\eta_y, f\rangle = \langle\omega_y, f_y\rangle$ if $f \in C(X)$ (here
$f_y(g) = f(g\cdot\tau(y))$).
3.3 Definitions. For fixed xeX, define 9,: G+X: g>gx. Let he C(G). Define a 
mapH: X-M}4(G): ¢H(x),h>=<n,,hogz'> (y=n(x)), where h(g)=h(g-'). 
Note that, if feC(X), then <n,, f>=<H(x), fog,> if xen~*(y). 
3.4 Lemma. (a) $H(gx) = g\cdot H(x)$ ($g \in G$, $x \in X$).

(b) $H(xt) = H(x)$ $\mu$-a.e. for each $t \in T$.

(c) $H$ is $\nu$-Lusin-measurable.

(d) $H$ is Baire measurable.

(e) $H(x) = \gamma_0$ $\mu$-a.e.
Proof. Parts (a), (b), and (c) are proved as in ([10], 5.2). Part (e) is proved as in 
([10], 5.4). 

We show that $H$ is Baire measurable. Consider the map
$v\colon G \times Y \to X\colon (g, y) \mapsto g\cdot\tau(y)$. It is bijective.
Since $\tau$ is Baire measurable, $v$ is, also. Moreover, $v$ takes Baire sets
to Baire sets. To prove this, one uses two facts. i) The Baire sets of
$G \times Y$ are generated by sets of the form $A \times B$, where
$A \subset G$, $B \subset Y$ are Baire. ii) The construction of $\tau$ shows
that each $\tau|W_i$ admits an extension to a continuous map
$\tau_i\colon \pi(F_{x_i}) \to F_{x_i} \subset X$ (here $W_i$, $F_{x_i}$ are as
in the proof of 2.5). Hence $v|G \times W_i$ takes Baire sets to Baire sets, so
$v$ does, also. Now: following definitions, one
sees that H is given by x>v~'(x)=(g, y)>(g, w,)+-g-@,>8-D,. Here {£-@,, h> 
=<g-a,,h> (he C(G)). All maps above are Baire, so H is Baire. 

The next paragraph is motivated by [17]. 
3.5 Definition. Let $\alpha$ be a left Haar measure on $T$. Let

$$L_1 = \Bigl\{\varphi \in L^1(T, \alpha) \Bigm| \varphi \ge 0,\ \int \varphi(t)\,d\alpha(t) = 1\Bigr\},$$

and let $L_0$ be a countable dense subset of $L_1$. For each $x \in X$ and
$\varphi \in L_1$, define an element $H_\varphi(x)$ of $M_+^1(G)$ by
$\langle H_\varphi(x), h\rangle = \int \varphi(t)\langle H(xt), h\rangle\,d\alpha(t)$
($h \in C(G)$). Note that, if we know the $H_\varphi$'s for $\varphi \in L_0$,
we know them for all $\varphi \in L_1$.

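3.6 Lemma. (a) $H_\varphi(g\cdot x) = g\cdot H_\varphi(x)$ ($g \in G$,
$x \in X$, $\varphi \in L_1$).

(b) $H_\varphi$ is $\nu$-Lusin-measurable ($\varphi \in L_1$).

(c) $H_\varphi$ is Baire measurable ($\varphi \in L_1$).

(d) $H_\varphi(x) = \gamma_0$ $\mu$-a.e. ($\varphi \in L_1$).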

Proof. Parts (a) and (b) follow from the definition of $H_\varphi$, and
3.4(a), (c).

(c) It suffices to prove that $x \mapsto \langle H_\varphi(x), h\rangle$ is
Baire for each $h \in C(G)$. (Proof. $H_\varphi$ is Baire iff
$u \circ H_\varphi$ is Baire for each $u \in C(M_+^1(G))$. But the set
$\{u \in C(M_+^1(G)) \mid u(\beta) = \langle\beta, h\rangle$ for some
$h \in C(G)\}$ contains the constants and separates points, hence is dense in
$C(M_+^1(G))$.) Now, the map $x \mapsto \langle H(x), h\rangle$ is a bounded
Baire function on $X$ ($h \in C(G)$). Hence it suffices to show that (*)
$f_\varphi\colon x \mapsto \int \varphi(t) f(x\cdot t)\,d\alpha(t)$ is Baire for
each bounded Baire function $f\colon X \to \mathbb{R}$. But (*) holds if
$f \in C(X)$, so a monotone class argument ([13], I, Theorem 20) shows that (*)
holds if $f$ is bounded Baire.

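(d) Let $B = \{x \in X \mid H(x) = \gamma_0\}$. Then by 3.4(b) and (e),
$\mu(Bt \,\triangle\, B) = 0$ for each $t$. For $\varphi \in L_1$, let
$B_\varphi = \{x \in X \mid \int \varphi(t)\,\psi_B(x\cdot t)\,d\alpha(t) = 1\}$
($\psi_B$ = characteristic function of $B$), and let
$B_0 = \bigcap_{\varphi \in L_1} B_\varphi = \bigcap_{\varphi \in L_0} B_\varphi$.
As in ([15], Lemma 3.3), $B_0\cdot t = B_0$ ($t \in T$; i.e., $B_0$ is strictly
$T$-invariant), and $\mu(B_0 \,\triangle\, B) = 0$, i.e., $\mu(B_0) = 1$. Note
that $B_0 = \{x \in X \mid x\cdot t \in B$ for $\alpha$-a.a. $t\}$. Certainly,
then, if $x \in B_0$, then $H_\varphi(x) = \gamma_0$ for all $\varphi \in L_1$.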
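3.7 Definitions, Remarks. Let
$A = \{x \in X \mid H_{\varphi_1}(x) = H_{\varphi_2}(x)$ for all
$\varphi_1, \varphi_2 \in L_1\}$. Then
$A = \{x \in X \mid H_{\varphi_1}(x) = H_{\varphi_2}(x)$ for
$\varphi_1, \varphi_2 \in L_0\}$. By 3.6(d), $\mu(A) = 1$; by 3.6(d),
$A = G\cdot A = \{g\cdot x \mid g \in G, x \in A\}$. That is,
$A = \pi^{-1}\pi(A)$. By 3.6(c), $A$ is Baire. Define a map
$\hat{H}\colon X \to M_+^1(G)$ as follows: $\hat{H}(x) = H_\varphi(x)$ for some
(hence any) $\varphi \in L_1$ if $x \in A$; $\hat{H}(x) = \gamma$ if
$x \notin A$.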
3.8 Lemma. (a) $\hat{H}(x) = H(x)$ on a set of the form $\pi^{-1}(D)$
($D \subset Y$), where $\nu(D) = 1$.

(b) $\hat{H}(g\cdot x) = g\cdot\hat{H}(x)$ ($g \in G$, $x \in X$).

(c) $\hat{H}$ is Baire measurable.

(d) $\hat{H}(xt) = \hat{H}(x)$ ($x \in X$, $t \in T$).
Proof. (a) From 3.4 and the definition of $\hat{H}$, one has $\hat{H} = H$ on a
set $\pi^{-1}(D)$. By 3.4(e), 3.6(d), and the definition of $\hat{H}$, one has
$1 = \mu(\pi^{-1}(D)) = \nu(D)$.

(b) Use the definition.

(c) On the Baire set $A$, $\hat{H} = H_\varphi$ for some $\varphi \in L_1$. On
$\sim A$, $\hat{H}$ = the constant $\gamma$. Both $H_\varphi$ and this constant
function are Baire measurable.
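(d) Let $x \in A$, $s \in T$, $\varphi \in L_1$. Then

$$\langle H_\varphi(xs), h\rangle = \int_T \varphi(s^{-1}t)\langle H(xt), h\rangle\,d\alpha(t) = \langle H_\rho(x), h\rangle$$

where $\rho(t) = \varphi(s^{-1}t)$. Thus $H_\varphi(xs) = H_\rho(x)$. Since
$\rho \in L_1$, $H_\rho(x) = H_\varphi(x)$. Letting $\varphi$ vary, we obtain
$H_{\varphi_1}(xs) = H_{\varphi_1}(x) = H_{\varphi_2}(x) = H_{\varphi_2}(xs)$
for all $\varphi_1, \varphi_2 \in L_1$.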

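3.9 Definition. Define $\hat\eta\colon Y \to M_+^1(X)$:
$\langle\hat\eta_y, f\rangle = \langle\hat{H}(x), f \circ \varphi_x\rangle$ for
some (hence any) $x \in \pi^{-1}(y)$.

3.10 Proposition. (a) $\hat\eta$ is a $\nu$-Lusin-measurable, strict
disintegration of $\mu$ with respect to $\pi$.

(b) It is also a Baire disintegration of $\mu$ with respect to $\pi$.

(c) It is strictly $T$-invariant: $\hat\eta_{y\cdot t} = (\hat\eta_y)\cdot t$
($t \in T$).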

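Proof. (a) By 3.8(a) and 3.3, $\eta$ and $\hat\eta$ coincide on a set $D$ of
$\nu$-measure 1. Part (a) follows from 2.5.

(b) We first show that
$u\colon x \mapsto \langle\hat{H}(x), f \circ \varphi_x\rangle$ is Baire
measurable ($f \in C(X)$). The map
$x \mapsto f \circ \varphi_x\colon X \to C(G)$ is continuous. Let
$Q = \{f \circ \varphi_x \mid x \in X\} \subset C(G)$; $Q$ is compact. Write
$x \mapsto (\hat{H}(x), f \circ \varphi_x) \mapsto \langle\hat{H}(x), f \circ \varphi_x\rangle\colon X \to M_+^1(G) \times Q \to \mathbb{R}$.
Both maps are Baire, so $u$ is Baire. Now, $u(x) = \bar{u} \circ \pi(x)$, where
$\bar{u}(y) = \langle\hat\eta_y, f\rangle$ is a function on $Y$. We claim
$\bar{u}$ is Baire. For, let $\gamma_y(f) = \int_G f(gx)\,d\gamma(g)$ for some
(hence any) $x \in \pi^{-1}(y)$ ($f \in C(X)$). Then $\gamma_y \in M_+^1(X)$,
and $y \mapsto \gamma_y$ is continuous. By ([13], I, Theorem 20),
$y \mapsto \gamma_y(u)$ is Baire measurable. But $\bar{u}(y) = \gamma_y(u)$.
(c) Note

$$\langle(\hat\eta_y)\cdot t, f\rangle = \langle\hat\eta_y, tf\rangle = \langle\hat{H}(x), (tf) \circ \varphi_x\rangle
= \langle\hat{H}(x), f \circ \varphi_{x\cdot t}\rangle = \text{(3.8(d))}\ \langle\hat{H}(xt), f \circ \varphi_{x\cdot t}\rangle = \langle\hat\eta_{y\cdot t}, f\rangle.$$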
3.11 Theorem. Let $(G, X, T)$ be as in 3.1. Then $\mu$ has a strictly
$T$-invariant, strict Baire disintegration $\lambda$ with respect to $\pi$. If
$\mu$ is completion regular, then $\lambda$ is $\nu$-adequate.


Proof. The proof is the same as those of 2.6 and 2.7, except that here
$\mathcal{H} = \{(H, \lambda^H) \mid H$ is a closed normal subgroup of $G$,
$\lambda^H$ is a strictly $T$-invariant, strict Baire disintegration of
$\pi_H(\mu)$ with respect to $\nu\}$. We use 3.10 in place of 2.5. We omit
details.


3.12 Remark. Using the characterization of T-invariant measures given in ([11], 
Theorem 1.5.4), one can replace “ergodic” by “invariant” in 3.11. 
Counterexamples to Results of M.M. Rao


N. Herrndorf 


Mathematisches Institut der Universität Köln, Weyertal 86, 5000 Köln 41,
Federal Republic of Germany


Summary. Orlicz spaces and Prediction Operators in these spaces have been 
investigated by M.M. Rao in a number of well known papers. Since these 
papers are widely recognized as pioneering work in this field, it seems worth 
while to point out that some of his main results and many of his proofs are 
false. In particular M.M. Rao’s result on strict convexity of Orlicz spaces and 
his proofs of convergence theorems for prediction sequences are false. 


1. Introduction and Notations 


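Let $(\Omega, \mathcal{A}, P)$ be a probability space and $E$ be a Banach space.
$\Phi\colon \mathbb{R} \to [0, \infty)$ is called a Young's function
(Y-function), if $\Phi$ is symmetric, convex, $\Phi(0) = 0$, $\Phi(x) > 0$ for
$x \ne 0$. If $\Phi$ is a Y-function, let $L_\Phi(\Omega, \mathcal{A}, P; E)$ be
the space of all equivalence classes of strongly measurable $E$-valued functions
$X$ on $\Omega$, which satisfy:

$$N_\Phi(X) = \inf\Bigl\{k > 0 : \int \Phi\Bigl(\frac{\|X\|}{k}\Bigr)\,dP \le \Phi(1)\Bigr\} < \infty.$$

Then by the same arguments as in the case $E = \mathbb{R}$ (see: Luxemburg [5])
it is verified that $(L_\Phi, N_\Phi)$ is a Banach space. If
$\mathcal{B} \subset \mathcal{A}$ is a $\sigma$-field, define:

$$L_\Phi(\mathcal{B}) = \{X \in L_\Phi(\Omega, \mathcal{A}, P; E) : \text{there is a } \mathcal{B}\text{-measurable function in the class } X\}$$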
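The definition of $N_\Phi$ can be made concrete on a finite probability space with $E = \mathbb{R}$. The following is a minimal sketch under that assumption; the function name `luxemburg_norm` and the bisection tolerance are illustrative choices, not part of the paper. It uses that $k \mapsto \int \Phi(|X|/k)\,dP$ is nonincreasing, so the infimum can be located by bisection.

```python
def luxemburg_norm(values, probs, phi, iterations=200):
    """Approximate N_Phi(X) = inf{k > 0 : sum_i p_i * phi(|x_i| / k) <= phi(1)}
    for a real random variable X on a finite probability space.

    values: the values x_i of X; probs: the weights p_i (summing to 1);
    phi: a Young function (symmetric, convex, phi(0) = 0).
    """
    xs = [abs(v) for v in values]
    if max(xs) == 0:
        return 0.0
    target = phi(1.0)

    def modular(k):
        # integral of phi(|X| / k) with respect to P
        return sum(p * phi(x / k) for x, p in zip(xs, probs))

    lo, hi = 0.0, max(xs)
    while modular(hi) > target:      # safety net; max|x_i| normally suffices
        hi *= 2.0
    for _ in range(iterations):      # bisection on the monotone modular
        mid = (lo + hi) / 2.0
        if modular(mid) <= target:
            hi = mid
        else:
            lo = mid
    return hi
```

For $\Phi(x) = x^2$ the result agrees with the $L_2$ norm and for $\Phi(x) = |x|$ with the $L_1$ norm, which gives a quick sanity check of the normalization by $\Phi(1)$.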
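1.1 Definition. Assume that for each $X \in L_\Phi(\Omega, \mathcal{A}, P; E)$
there is exactly one $X_{\mathcal{B}} \in L_\Phi(\mathcal{B})$, such that:
(1.2) $N_\Phi(X - X_{\mathcal{B}}) = \inf\{N_\Phi(X - Y) : Y \in L_\Phi(\mathcal{B})\}$.


Then define an operator $P_\Phi^{\mathcal{B}}$ on $L_\Phi$ by
$P_\Phi^{\mathcal{B}} X = X_{\mathcal{B}}$.

In [6-13] M.M. Rao obtains important results on the operators
$P_\Phi^{\mathcal{B}}$ and the spaces $L_\Phi$. Regrettably, some of these
results are false, others are proven in a false way or turn out to be trivial.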
Section 2 of this paper discusses M.M. Rao's results in [12] on the existence
and uniqueness of $X_{\mathcal{B}}$ in (1.2). It turns out that Rao's existence
proof is wrong unless $E$ is reflexive. His uniqueness result is not true in
general.

The main problem investigated in [8] and [12] is the convergence of
$P_\Phi^{\mathcal{B}_n} X$ for a sequence of $\sigma$-fields $\mathcal{B}_n$
with $\mathcal{B}_n \subset \mathcal{B}_{n+1}$ for $n \in \mathbb{N}$. The
results are extensions of the convergence theorem of Ando and Amemiya for
prediction sequences in $L_p$ ($1 < p < \infty$). The proof of M.M. Rao's
a.e.-convergence theorem is not correct, for the generalizations of Lemma 1 of
[1] stated by M.M. Rao (see Lemma 2 of [8] and Lemma 3.3 of [12]) are not true
for general $\Phi$. The last statement of M.M. Rao in this affair is [13]. In
this correction note M.M. Rao makes us believe that he always intended to state
and prove the theorem under the additional condition that $L_\Phi$ has a certain
property (*). (In [8] and [12] however, this property was stated as an immediate
consequence of the definition of $P_\Phi^{\mathcal{B}}$; see: [8] p. 171 and
[12] p. 135.) Regrettably, this additional condition is very restrictive. It
turns out (see Th. 3.2 below) that $L_\Phi$ has this property only if $P$
assumes only finitely many values or $\Phi(x) = c|x|^p$ with $p > 1$. In the
first case the results of M.M. Rao are obvious, in the second case, i.e. for
$L_p$-spaces, they are well known (see [1]). This seems to be at variance with
M.M. Rao's claim that his results "are the best possible in the context of
$L_\Phi$-spaces" (see [12] p. 130). Notice that the proof in [12] remains false
even in the restricted case, because M.M. Rao's Lemma 2.6, which is "crucial for
the work", is false (see Counterexample 3.4 below). We conclude our paper with a
collection of some further errors of M.M. Rao concerning Orlicz spaces.


2. Existence and Uniqueness of Best Approximants 


In order to discuss the existence result of M.M. Rao, we specialize his Lemma 3.5
of [12] to the trivial case that $(\Omega, \mathcal{A}, P)$ is a one point
probability space. Then Rao's Lemma yields:

(i) Let $E$ be a strictly convex B-space with the RN-property. Then: If
$C \subset E$ is a nonempty closed convex subset, then for any $f_0 \in E$ the
functional $F(\cdot)$ defined by $F(f) = \|f - f_0\|$ assumes its minimum on the
set $C$, whenever $C$ is weakly sequentially complete.


2.1 Example. Let $E = l_1 = \{(a_i) \in \mathbb{R}^{\mathbb{N}} : \sum |a_i| < \infty\}$
and $\|(a_i)\| = \sum |a_i| + (\sum |a_i|^2)^{1/2}$. Then $(E, \|\ \|)$ is a
Banach space and $\|\ \|$ is strictly convex and equivalent to the usual
$\|\ \|_1$ on $l_1$. $E$ has the RN-property, since $(E, \|\ \|_1)$ is a
separable dual space and the RN-property is an isomorphic invariant. However,
every closed convex set $C \subset E$ is weakly sequentially complete, since $E$
is weakly sequentially complete (see Dunford, Schwartz [3] IV.8.6). Thus
according to (i): For every nonempty closed convex set $C \subset E$ and every
$f_0 \in E$ there is $f \in C$ such that
$\|f - f_0\| = \inf\{\|f_0 - g\| : g \in C\}$. According to Singer [14] p. 99
Cor. 2.4 this implies that $E$ is reflexive, which is obviously wrong.
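The two properties of the norm in Example 2.1 can be checked numerically on finitely supported sequences. This is only an illustrative sketch; the helper names `l1_norm` and `mixed_norm` are hypothetical. It shows the bounds $\|a\|_1 \le \|a\| \le 2\|a\|_1$ (since the $l_2$ norm never exceeds the $l_1$ norm) and a pair of points where $\|\ \|_1$ fails strict convexity while the mixed norm does not.

```python
import math


def l1_norm(a):
    """Usual l_1 norm of a finitely supported sequence."""
    return sum(abs(x) for x in a)


def mixed_norm(a):
    """The norm of Example 2.1: l_1 norm plus l_2 norm."""
    return l1_norm(a) + math.sqrt(sum(x * x for x in a))


def midpoint(a, b):
    """Coordinatewise midpoint of two sequences of equal length."""
    return [(x + y) / 2.0 for x, y in zip(a, b)]


# Two unit vectors with equal norms in both norms.
a, b = [1.0, 0.0], [0.0, 1.0]
m = midpoint(a, b)

# In l_1 the midpoint still has norm 1: no strict convexity.
l1_values = (l1_norm(a), l1_norm(b), l1_norm(m))
# In the mixed norm the midpoint is strictly shorter than a and b.
mixed_values = (mixed_norm(a), mixed_norm(b), mixed_norm(m))
```

A finite check like this cannot prove strict convexity; it only illustrates why adding the $l_2$ term removes the flat pieces of the $l_1$ unit sphere.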

Therefore the existence result of Lemma 3.5 in [12] is false. The proof of 3.5
is correct iff $E$ is reflexive, for every uniformly integrable set
$H \subset L_1(E)$ is weakly sequentially compact iff $E$ is reflexive.

Lemma 2.2 of [12] gives the solution to the uniqueness problem in (1.2).
M.M. Rao claims:
(ii) $L_\Phi(\Omega, \mathcal{A}, P; E)$ is strictly convex, if $E$ is strictly
convex and $\Phi$ is strictly convex, continuously differentiable and
$\Phi'(t) \to \infty$ for $t \to \infty$.
For the case $E = \mathbb{R}$ this was first stated in [6] Th. 4. The proof of
Th. 4 yields another result that is explicitly stated in [10] p. 556, 557:
(iii) If $\Phi$ is strictly convex and continuously differentiable, then

$$\int \Phi\Bigl(\frac{|X|}{N_\Phi(X)}\Bigr)\,dP = \Phi(1) \quad \text{for all } X \in L_\Phi(\Omega, \mathcal{A}, P; \mathbb{R}) - \{0\}.$$


The following theorem shows that M.M. Rao's assertions (ii) and (iii) are
wrong even in the case $E = \mathbb{R}$ unless $\int \Phi(|X|)\,dP < \infty$ for
all $X \in L_\Phi$. In [15] Sundarasan has shown that
$L_\Phi(\Omega, \mathcal{A}, P; \mathbb{R})$ is isomorphic to a strictly convex
space $L_\Psi(\Omega, \mathcal{A}, P; \mathbb{R})$, if $\Phi$ satisfies a so
called $\Delta_2$-condition ($\Phi(2x) \le c\,\Phi(x)$ for all $x \ge x_0$ with
some $x_0 \ge 0$ and $c > 0$). The proof of Sundarasan's Th. 1 obviously also
works under the sole assumption that $\int \Phi(|X|)\,dP < \infty$ for all
$X \in L_\Phi$. In [11] Th. 1 Rao tries to generalize Sundarasan's result to the
case of an arbitrary Y-function $\Phi$. But the main tool in the proof is Th. 4
of [6], which is false whenever $\int \Phi(|X|)\,dP = \infty$ for some
$X \in L_\Phi$. Thus Rao's proof is false, whenever his theorem is a nontrivial
extension of Sundarasan's result.


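2.2 Theorem. Let $\Phi$ be a strictly convex Y-function. Then:
(2.3) $(L_\Phi(\Omega, \mathcal{A}, P; \mathbb{R}), N_\Phi)$ is strictly convex
$\iff \int \Phi(|X|)\,dP < \infty$ for all $X \in L_\Phi$.
(2.4) If $\int \Phi(|X|)\,dP = \infty$ for some $X \in L_\Phi$, then there is
$Y \in L_\Phi - \{0\}$, $Y \ge 0$, such that

$$\int \Phi\Bigl(\frac{Y}{N_\Phi(Y)}\Bigr)\,dP < \Phi(1).$$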
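Proof. (2.3) "$\Leftarrow$" is contained in the proof of Th. 1 in [15].

(2.4): We show first: If $(Z, \mathcal{F}, \mu)$ is a finite measure space,
$\Phi$ a Y-function and if there is $Y \in L_\Phi(Z, \mathcal{F}, \mu; \mathbb{R})$
with $\int \Phi(|Y|)\,d\mu = \infty$, then for every $\varepsilon > 0$ there is
$Y_\varepsilon \in L_\Phi(Z, \mathcal{F}, \mu; \mathbb{R})$,
$Y_\varepsilon \ge 0$, such that: $\int \Phi(Y_\varepsilon)\,d\mu < \varepsilon$,
$\int \Phi((1+\varepsilon) Y_\varepsilon)\,d\mu = \infty$.

Let $\varepsilon > 0$ be given. There is $Z_\varepsilon \in L_\Phi$,
$Z_\varepsilon \ge 0$, such that $\int \Phi(Z_\varepsilon)\,d\mu < \infty$ and
$\int \Phi((1+\varepsilon) Z_\varepsilon)\,d\mu = \infty$. Then for
$n \to \infty$: $\int \Phi(Z_\varepsilon 1_{\{Z_\varepsilon \ge n\}})\,d\mu \to 0$.
Therefore one can take $Y_\varepsilon = Z_\varepsilon 1_{\{Z_\varepsilon \ge n\}}$
with a sufficiently large $n \in \mathbb{N}$.

If there is $Y \in L_\Phi(\Omega, \mathcal{A}, P; \mathbb{R})$ such that
$\int \Phi(|Y|)\,dP = \infty$, there is a partition of $\Omega$ into countably
many disjoint $\Omega_n$, $n \in \mathbb{N}$, such that
$\int_{\Omega_n} \Phi(|Y|)\,dP = \infty$ for $n \in \mathbb{N}$.
Now apply the preceding remark to
$(Z, \mathcal{F}, \mu) = (\Omega_n, \mathcal{A} \cap \Omega_n, P|\mathcal{A} \cap \Omega_n)$
and choose functions
$Y_n \in L_\Phi(\Omega_n, \mathcal{A} \cap \Omega_n, P|\mathcal{A} \cap \Omega_n; \mathbb{R})$,
$Y_n \ge 0$ such that: $\int \Phi(Y_n)\,dP < \varepsilon_n$ and
$\int \Phi((1+\varepsilon_n) Y_n)\,dP = \infty$ with
$\varepsilon_n := 2^{-n-1}\Phi(1)$. Define $Y := \sum_n Y_n 1_{\Omega_n}$. Then
$Y \in L_\Phi - \{0\}$, $Y \ge 0$, $\int \Phi(Y)\,dP < \tfrac12\Phi(1)$ and
$\int \Phi(aY)\,dP = \infty$ for all $a > 1$. Thus $N_\Phi(Y) = 1$
and this proves (2.4).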
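Now we show:

(2.5) If $Y$ is as in (2.4), then there is an $\varepsilon > 0$, such that
$\{a \in \mathbb{R} : N_\Phi(Y - a) = \inf_{b \in \mathbb{R}} N_\Phi(Y - b)\} \supset [0, \varepsilon]$.

Let $Y$ be as in (2.4). Let $a \in \mathbb{R}$ be arbitrary and
$0 < k < N_\Phi(Y)$. Choose $m$ with $k < m < N_\Phi(Y)$. Then:

$$\frac{Y}{m} = \frac{k}{m}\cdot\frac{Y - a}{k} + \Bigl(1 - \frac{k}{m}\Bigr)\frac{a}{m(1 - k/m)}.$$

Convexity of $\Phi$ yields:

$$\infty = \int \Phi\Bigl(\frac{Y}{m}\Bigr)\,dP \le \frac{k}{m}\int \Phi\Bigl(\frac{Y - a}{k}\Bigr)\,dP + \Bigl(1 - \frac{k}{m}\Bigr)\Phi\Bigl(\frac{a}{m(1 - k/m)}\Bigr).$$

Thus $\int \Phi\bigl(\frac{Y - a}{k}\bigr)\,dP = \infty$ for all
$k \in (0, N_\Phi(Y))$, $a \in \mathbb{R}$ and therefore
$N_\Phi(Y - a) \ge N_\Phi(Y)$ for all $a \in \mathbb{R}$. Since $\Phi$ is
continuous and $\int \Phi\bigl(\frac{Y}{N_\Phi(Y)}\bigr)\,dP < \Phi(1)$ we
obtain for sufficiently small $a \ge 0$:

$$\int \Phi\Bigl(\frac{Y - a}{N_\Phi(Y)}\Bigr)\,dP \le \Phi\Bigl(\frac{a}{N_\Phi(Y)}\Bigr) + \int \Phi\Bigl(\frac{Y}{N_\Phi(Y)}\Bigr)\,dP \le \Phi(1).$$

Thus there is $\varepsilon > 0$, such that $N_\Phi(Y - a) = N_\Phi(Y)$ for
$a \in [0, \varepsilon]$.
(2.3) "$\Rightarrow$" follows now immediately from (2.4) and (2.5).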


3. Convergence of $P_\Phi^{\mathcal{B}_n} X$ for an Increasing Sequence $\mathcal{B}_n$
of Sub-$\sigma$-fields of $\mathcal{A}$


Since the main errors of M.M. Rao's theorems on this subject persist for
$E = \mathbb{R}$, we restrict ourselves to this case. In [13] M.M. Rao made an
additional assumption under which his proofs are allegedly correct, the
(*)-property of $L_\Phi$.


3.1 Definition. Assume that $P_\Phi^{\mathcal{B}}$ is uniquely defined for every
$\sigma$-field $\mathcal{B} \subset \mathcal{A}$. $L_\Phi$ has the (*)-property
if $P_\Phi^{\mathcal{B}}(XY) = Y P_\Phi^{\mathcal{B}} X$ for $X \in L_\Phi$ and
bounded $\mathcal{B}$-measurable $Y$ and every $\sigma$-field
$\mathcal{B} \subset \mathcal{A}$.


The $L_\Phi$-spaces with this property will be described in the following
theorem.


3.2 Theorem. Assume that $\Phi$ is strictly convex, differentiable (then
$\Phi'$ is continuous and $\Phi'(0) = 0$, since $\Phi$ is a Y-function) and
$\int \Phi(|X|)\,dP < \infty$ for all $X \in L_\Phi$. Then
$P_\Phi^{\mathcal{B}}$ is uniquely defined for every $\sigma$-field
$\mathcal{B} \subset \mathcal{A}$ (see Landers and Rogge [4]).


Assume furthermore that $L_\Phi$ has the (*)-property. Then at least one of the
two following statements is true:


(a) $\Phi(x) = c|x|^p$ for some $p > 1$, $c > 0$.
(b) $P$ attains only finitely many values.


Proof. Assume that (b) is not fulfilled. Additionally one may assume
$\Phi'(1) = 1$, for $\Phi'(1) > 0$ and $(L_\Phi, N_\Phi)$ does not change when
$\Phi$ is multiplied by a constant. Consider the following sets:

$$T = \{u > 0 : \exists v(u) > 0\ \forall x \ge 0\quad \Phi'(x) = u\,\Phi'(v(u)x)\}$$

and for $L > 0$:

$$T_L = \{u \ge 1 : \exists v(u) > 0\ \forall x \in [0, L]\quad \Phi'(x) = u\,\Phi'(v(u)x)\}.$$
Let L>0 be given. Since (b) is not fulfilled one may choose four disjoint sets A;


.» 4, @3S44,a,+a,< bot and 


@(2L) 
{P(B): Bed, Bo Q— U A;} is an infinite set. There is a sequence (B,),-n in A 


4 i=1 
with B,<cQ— |) A;, such that: b,=P(B,)>0, bn< east aah b.>b, 44 
i= 
for keN and b,—>0. Let now keN be fixed. Let @=o{A,VAz, 


A,VA,UB,}. For a20 let X,EL, be defined by: 


=1,...4 in &%, such gor oi Saar rime 


on A, 

on A, 

on A, 

on A,UB, 
elsewhere 


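It is easy to see that
$a \mapsto \delta(a) = \inf\{N_\Phi(X_a - Y) : Y \in L_\Phi(\mathcal{B})\}$ is a
continuous function and $\lim_{a \to \infty} \delta(a) = \infty$. Since
$\int \Phi(|X_0| \cdot 2L)\,dP = (a_3 + a_4 + b_k)\,\Phi(2L) \le \Phi(1)$ follows
$\delta(0) \le N_\Phi(X_0) \le \frac{1}{2L}$. Therefore:
$\{\delta(a) : a \ge 0\} \supset [\frac{1}{2L}, \infty)$. For $a \ge 0$ there
are uniquely determined $b(a)$, $c(a)$ such that:

$$P_\Phi^{\mathcal{B}} X_a = \begin{cases} b(a) & \text{on } A_1 \cup A_2 \\ c(a) & \text{on } A_3 \cup A_4 \cup B_k \\ 0 & \text{elsewhere} \end{cases}$$

It is clear that always $c(a) \in [-1, 1]$. Now define a function
$f_a\colon [-1, 1] \to \mathbb{R}$ by

$$f_a(t) = a_3\,\Phi\Bigl(\frac{t + 1}{\delta(a)}\Bigr) + (a_4 + b_k)\,\Phi\Bigl(\frac{1 - t}{\delta(a)}\Bigr).$$

$f_a$ has an absolute minimum at $t = c(a)$. Using the differential calculus one
obtains: $c(a) \in (-1, 1)$ since $f_a'(-1) < 0$, $f_a'(1) > 0$ and
$f_a'(c(a)) = 0$. This gives the following formula:

$$a_3\,\Phi'\Bigl(\frac{c(a) + 1}{\delta(a)}\Bigr) = (a_4 + b_k)\,\Phi'\Bigl(\frac{1 - c(a)}{\delta(a)}\Bigr).$$

Since $X_a = X_0$ on $A_3 \cup A_4 \cup B_k$ the (*)-property of $L_\Phi$ implies
that $c(a)$ does not depend on $a$. Therefore there is $c \in (-1, 1)$, such
that:

$$a_3\,\Phi'\Bigl(\frac{c + 1}{\delta(a)}\Bigr) = (a_4 + b_k)\,\Phi'\Bigl(\frac{1 - c}{\delta(a)}\Bigr) \quad \text{for all } a \ge 0.$$

As $a_3 < a_4 + b_k$ we have $c > 0$. Since
$\{\delta(a) : a \ge 0\} \supset [\frac{1}{2L}, \infty)$, we obtain
$a_3\,\Phi'(x(c + 1)) = (a_4 + b_k)\,\Phi'(x(1 - c))$ for all $x \in [0, 2L]$
and hence:

$$\Phi'(x) = a_3^{-1}(a_4 + b_k)\,\Phi'\Bigl(\frac{1 - c}{1 + c}\,x\Bigr) \quad \text{for all } x \in [0, 2L].$$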
Set $u_k = a_3^{-1}(a_4 + b_k)$ for $k \in \mathbb{N}$. We have proved that for
every $k \in \mathbb{N}$ there is $v_k > 0$ such that

$$\Phi'(x) = u_k\,\Phi'(v_k x) \quad \text{for all } x \in [0, 2L].$$

Since $u_k \to a_3^{-1}a_4 > 0$ as $k \to \infty$, $\Phi'(1) = 1$ and
$(\Phi')^{-1}$ is continuous, we have
$v_k \to (\Phi')^{-1}(a_4^{-1}a_3) > 0$ as $k \to \infty$. Therefore we have
$v_k v_{k+1}^{-1} \le 2$ for sufficiently large $k$.

Thus for sufficiently large $k \in \mathbb{N}$ and all $x \in [0, L]$ we obtain,
applying the above relation twice:

$$\Phi'(v_k v_{k+1}^{-1} x) = u_{k+1}\,\Phi'(v_k x) = u_{k+1} u_k^{-1}\,\Phi'(x).$$

Therefore $u_k u_{k+1}^{-1} \in T_L$ for all sufficiently large
$k \in \mathbb{N}$. Thus 1 is an accumulation point of $T_L$, and since $T_L$ is
a semigroup with respect to multiplication and a closed subset of
$[1, \infty)$, it follows that $T_L = [1, \infty)$. As this holds for every
$L > 0$ and $T$ is a group with respect to multiplication, we have
$T = (0, \infty)$. Thus:

$$\forall u > 0\ \exists v > 0\ \forall x > 0 \quad \Phi'(x) = u\,\Phi'(vx).$$

If $y > 0$ is given, apply the above relation with $u = \frac{1}{\Phi'(y)}$.
Since $\Phi'(1) = 1$ and $\Phi'$ is strictly increasing, it follows that
$v = y$. Thus we obtain:

$$\forall x > 0\ \forall y > 0 \quad \Phi'(xy) = \Phi'(x)\,\Phi'(y).$$

According to Dieudonné [2] this implies $\Phi'(x) = x^s$ for $x > 0$ with some
$s \in \mathbb{R}$, and $s > 0$, since $\Phi'$ is strictly increasing.
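The converse direction of the last step is easy to check by hand or numerically: a power function $\Phi'(x) = x^s$ with $s > 0$ satisfies every scaling relation $\Phi'(x) = u\,\Phi'(vx)$ with the partner $v = u^{-1/s}$, and it is multiplicative. The sketch below only illustrates this; the names `dphi` and `scaling_partner` are hypothetical and not taken from the paper.

```python
def dphi(x, s):
    """Derivative of the normalized power Young function: Phi'(x) = x**s for x > 0."""
    return x ** s


def scaling_partner(u, s):
    """Return v > 0 with Phi'(x) = u * Phi'(v * x) for all x > 0, namely v = u**(-1/s)."""
    return u ** (-1.0 / s)


def max_scaling_error(u, s, xs):
    """Largest deviation |Phi'(x) - u * Phi'(v x)| over the sample points xs."""
    v = scaling_partner(u, s)
    return max(abs(dphi(x, s) - u * dphi(v * x, s)) for x in xs)
```

For example, with $s = 1.5$ (that is, $p = 2.5$ in $\Phi(x) = c|x|^p$) the error returned by `max_scaling_error` is at the level of floating point rounding for any $u > 0$.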


3.3 Remark. Even if $P$ is purely atomic with finitely many atoms, the
$L_\Phi$-spaces do not necessarily have the (*)-property. The proof of 3.2 shows
that for strictly convex differentiable $\Phi$, even in the case of 4-point
probability spaces, $\Phi'$ must fulfill several equations of the form
$\Phi'(x) = u\,\Phi'(vx)$, $x \in [0, L]$, with $u, v, L > 0$, whenever
$L_\Phi$ has the (*)-property.

Since differentiability and strict convexity of $\Phi$ is assumed in [12], Rao's
a.e.-convergence theorem 3.4 (3) of [12] covers only the case of $L_p$-spaces
with $p > 1$, and some $L_\Phi$-spaces, where $P$ attains only finitely many
values so that 3.4 (3) is trivial. Even for the cases covered by Rao's theorem
the proof remains false and so is the proof of 3.4 (2). According to page 134
Lemma 2.6 is "crucial for the work of section 3", and it is indeed his main
tool. This Lemma however is false. This is a part of the statement of 2.6:

(iv) Let $\Phi$ be a Y-function such that $\Phi'(x) \to \infty$ as
$x \to \infty$. Let $\sup\{\Phi'(x)/x : 0 < x < \infty\} < \infty$. Then there
is a constant $c > 0$, such that for $x' \ne 0$ and any $x$ one has:

$$\Phi(x') \ge \Phi(x) + \Phi'(x)(x' - x) + c\,\Phi(x - x')$$
The following example shows that (iv) is false. 
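3.4 Example. Define a strictly convex differentiable Y-function $\Phi$ as
follows:

$$\Phi(x) = \tfrac12 x^2, \quad 0 \le x \le 1$$

$$\Phi(x) = \Phi(n) + (x - n)\,\Phi'(n) + (x - n)^2\,\frac{1}{2n}, \quad n \le x \le n + 1$$

$$\Phi(x) = \Phi(-x), \quad x < 0$$

Then $\Phi'(n + 1) = 1 + 1 + \dots + \frac{1}{n}$ for $n \in \mathbb{N}$.
Furthermore $\Phi'(x)/x$ is continuous on $(0, \infty)$,
$\lim_{x \to 0} \Phi'(x)/x = 1$ and $\Phi'(x) \le n + 1$ for
$x \in [n, n + 1]$. Therefore $\sup\{\Phi'(x)/x : x > 0\} < \infty$. (iv)
implies: There exists $c > 0$, such that:

$$\Phi(x') \ge \Phi(x) + \Phi'(x)(x' - x) + c\,\Phi(x - x') \quad \text{for all } x, x' \in \mathbb{R}.$$

Setting $x = n$ and $x' = n + 1$ one obtains:

$$\Phi(n + 1) \ge \Phi(n) + \Phi'(n) + c\,\Phi(1) \quad \text{for all } n \in \mathbb{N}.$$

This gives $\frac{1}{2n} \ge c\,\Phi(1)$ for all $n \in \mathbb{N}$, which is
impossible, since $c > 0$, $\Phi(1) > 0$.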
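The mechanism of Example 3.4 can be followed with exact rational arithmetic. The sketch below is an illustration under a stated assumption: the quadratic coefficient on $[n, n+1]$ is taken to be $1/(2n)$, so that $\Phi'(n+1) = \Phi'(n) + 1/n$; the function names are hypothetical. It computes $\Phi(n)$, $\Phi'(n)$ at the integers and the ratio $(\Phi(n+1) - \Phi(n) - \Phi'(n))/\Phi(1)$, which would have to stay above the constant $c$ of (iv) but tends to 0.

```python
from fractions import Fraction


def build_phi(n_max):
    """Return lists (phi, dphi) with phi[n] = Phi(n), dphi[n] = Phi'(n), n = 0..n_max.

    Assumption: on [n, n+1] the function is
    Phi(n) + (x - n) * Phi'(n) + (x - n)**2 / (2n), and Phi(x) = x**2 / 2 on [0, 1].
    """
    phi = [Fraction(0), Fraction(1, 2)]    # Phi(0) = 0, Phi(1) = 1/2
    dphi = [Fraction(0), Fraction(1)]      # Phi'(0) = 0, Phi'(1) = 1
    for n in range(1, n_max):
        c_n = Fraction(1, 2 * n)           # assumed quadratic coefficient on [n, n+1]
        phi.append(phi[n] + dphi[n] + c_n)
        dphi.append(dphi[n] + 2 * c_n)
    return phi, dphi


def gap_ratio(n, phi, dphi):
    """(Phi(n+1) - Phi(n) - Phi'(n)) / Phi(1): an upper bound for any admissible c in (iv)."""
    return (phi[n + 1] - phi[n] - dphi[n]) / phi[1]


phi_values, dphi_values = build_phi(50)
ratios = [gap_ratio(n, phi_values, dphi_values) for n in range(1, 50)]
```

Under this assumption the ratio equals $1/n$ exactly, so no fixed $c > 0$ can satisfy (iv), while $\Phi'(n)/n$ stays bounded by 2.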


Some further errors of M.M. Rao:
(i) For $E = \mathbb{R}$ 3.4 (4) of [12] implies that
$P_\Phi^{\mathcal{B}} X \in L_\Phi(\mathcal{B})$ is determined by

$$\int \Phi(|X - P_\Phi^{\mathcal{B}} X|)\,dP = \inf_{Y \in L_\Phi(\mathcal{B})} \int \Phi(|X - Y|)\,dP$$

(see Landers, Rogge [4], remark 31). But this would imply
$P_\Phi^{\mathcal{B}}(1_B X) = 1_B P_\Phi^{\mathcal{B}} X$ for
$B \in \mathcal{B}$, and this can only be true for those $L_\Phi$-spaces, which
are described in Th. 3.2 above.

(ii) In order to apply Th. 3.6 of [12] in the way Rao does in the proof of 3.8,
he must know that $\bigcup_n L_\Phi(\mathcal{B}_n)$ is dense in
$L_\Phi(\mathcal{B}_\infty)$; this is not true in general if $\Phi$ does not
fulfill a $\Delta_2$-condition.

(iii) On page 135 of [12] M.M. Rao mentions a result that is taken from [9].
He claims: "$P_\Phi^{\mathcal{B}}$ is the usual conditional expectation (if and)
only if (when $E$ is a uniformly convex B-space) $L_\Phi(E)$ is $L_2(E)$". Of
course this cannot be true. The conditional expectation does not change, if an
equivalent norm on $E$ is introduced, whereas $P_\Phi^{\mathcal{B}}$ depends
crucially on the norm, even for $\Phi(x) = x^2$.


These notes are a part of the author's diploma thesis written under the guidance of Prof. D. Landers.
A Note on the Convergence of Sequences
of Conditional Expectations of Random Variables


Zheng Wei-an 


Department of Mathematics, Shanghai Normal University, Shanghai, China 


Summary. We disprove two theorems on the convergence of sequences of 
conditional expectations of random variables in [1] by providing a counter- 
example. 


Let $(x_n)_{n=1,2,\dots}$ be a sequence of random variables which are uniformly
integrable. It is well known that

$$\limsup_{n \to \infty} E[x_n] \le E[\limsup_{n \to \infty} x_n] \qquad (1)$$

but, if we replace the symbol $E[\,\cdot\,]$ by that of the conditional
expectation $E[\,\cdot\,|\mathcal{G}]$ where $\mathcal{G}$ is some
$\sigma$-field, will the inequality (1) still remain true? Such a proposition
needs a proof. But Liptser and Shiryayev state it as Theorem 1.2 of their book
[1] without giving a proof. They also state without proof (their Theorem 1.3)
that if $0 \le x_n \to x$ a.s., $E[x_n] < \infty$, then

$$E[x_n|\mathcal{G}] \to E[x|\mathcal{G}] \quad \text{a.s.} \qquad (2)$$

if and only if $(x_n)$ are uniformly integrable. Neither theorem is true. It is
the purpose of this note to disprove the two theorems by the following
counter-example.

Let $\Omega = (0,1] \times (0,1]$ be the unit square in the Cartesian plane,
$\mathcal{F}$ the Borel field on that square, and $P = \mu \times \mu$ the
product measure, where $\mu$ is the Lebesgue measure on $(0,1]$.

Thus, $(\Omega, \mathcal{F}, P)$ forms a probability space. For every positive
integer $n$, write $n = 2^k + j - 2$, where $k$ is a positive integer, with
$j = 1, 2, \dots, 2^{k+1} - 2^k$. Then, to every $n$ there corresponds a pair of
positive integers $(k, j)$ which we shall denote as functions of $n$ by
$(k, j) = (K(n), J(n))$.
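Define

$$A_n = \Bigl\{s : \frac{J(n) - 1}{2^{K(n)}} < s \le \frac{J(n)}{2^{K(n)}}\Bigr\},$$

$$B_n = \Bigl\{t : 0 < t \le \frac{1}{2^{K(n)}}\Bigr\},$$

$$x_n(s, t) = 2^{K(n)}\,1_{A_n \times B_n}(s, t),$$

$$\mathcal{G} = \mathcal{B}_0 \times (0,1],$$

where $\mathcal{B}_0$ is the Borel field on $(0,1]$. Evidently $\mathcal{G}$ is
a sub-$\sigma$-field of $\mathcal{F}$. For every $B \in \mathcal{G}$, we have
$B = B_0 \times (0,1]$ where $B_0 \in \mathcal{B}_0$. Furthermore,

$$\int_B x_n\,dP = \int_{B_0 \times (0,1]} 2^{K(n)}\,1_{A_n \times B_n}(s, t)\,dP
= 2^{K(n)}\,\mu[A_n \cap B_0]\,\mu(B_n) = \mu[A_n \cap B_0]$$

$$= \int_{B_0 \times (0,1]} 1_{A_n \times (0,1]}(s, t)\,dP = \int_B 1_{A_n \times (0,1]}(s, t)\,dP.$$

Hence

$$E[x_n|\mathcal{G}] = 1_{A_n \times (0,1]} \quad \text{a.s.}$$

Since $x_n \to 0$ and $E[x_n] \to 0$, $(x_n)$ are uniformly integrable. They
satisfy the conditions of Theorems 1.2 and 1.3 of [1]. But

$$\limsup_{n \to \infty} E[x_n|\mathcal{G}] = \limsup_{n \to \infty} 1_{A_n \times (0,1]} = 1 > 0 = E[\limsup_{n \to \infty} x_n|\mathcal{G}] \quad \text{a.s.}$$

leads to a contradiction.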
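The counter-example can be traced with exact arithmetic. The sketch below is illustrative only; the function names are hypothetical, and it takes $A_n$ to be the dyadic interval $((J(n)-1)/2^{K(n)}, J(n)/2^{K(n)}]$ and $B_n = (0, 2^{-K(n)}]$, as the integral computation above requires. It recovers $(K(n), J(n))$ from $n = 2^k + j - 2$, evaluates $x_n$ and the conditional expectation $1_{A_n \times (0,1]}$ at a point, and gives $E[x_n] = 2^{-K(n)}$.

```python
from fractions import Fraction


def index_pair(n):
    """Return (k, j) with n = 2**k + j - 2 and 1 <= j <= 2**k."""
    k = 1
    while 2 ** (k + 1) - 2 < n:      # block k covers n = 2**k - 1, ..., 2**(k+1) - 2
        k += 1
    return k, n - 2 ** k + 2


def in_a(n, s):
    """True if s lies in A_n = ((j-1)/2**k, j/2**k]."""
    k, j = index_pair(n)
    return Fraction(j - 1, 2 ** k) < s <= Fraction(j, 2 ** k)


def x_n(n, s, t):
    """x_n(s, t) = 2**k on A_n x B_n with B_n = (0, 2**-k], and 0 elsewhere."""
    k, _ = index_pair(n)
    return 2 ** k if in_a(n, s) and 0 < t <= Fraction(1, 2 ** k) else 0


def cond_exp(n, s):
    """E[x_n | G](s, t) = indicator of A_n x (0, 1]; it does not depend on t."""
    return 1 if in_a(n, s) else 0


def mean_x(n):
    """E[x_n] = 2**k * mu(A_n) * mu(B_n) = 2**-k."""
    k, _ = index_pair(n)
    return Fraction(1, 2 ** k)
```

At a fixed point such as $(s, t) = (1/3, 1/3)$ the values $x_n(s, t)$ vanish for all $n \ge 3$, while `cond_exp(n, s)` returns 1 exactly once in every block $k = 1, 2, 3, \dots$, which is the behaviour the contradiction relies on.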


n-—0o 


Note of a referee: The example above also shows that the Corollary to 
Theorem 1.2 and 1.3 of [1] is wrong. 
The D.L.R. Conditions for Translation Invariant
Gaussian Measures on $\mathcal{S}'(\mathbb{R}^d)$


R. Holley* and D. Stroock** 
Mathematics Department, University of Colorado, Boulder CO 80309, USA 


Introduction 


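Let $\mathcal{S}(\mathbb{R}^d)$ be the Schwartz space of real-valued $C^\infty$
functions on $\mathbb{R}^d$ which, together with all their derivatives, are
rapidly decreasing, and denote by $\mathcal{S}'(\mathbb{R}^d)$, the dual of
$\mathcal{S}(\mathbb{R}^d)$, the space of tempered distributions. If
$\mathcal{G}$ is an open set in $\mathbb{R}^d$, let $\mathcal{F}_{\mathcal{G}}$
be the $\sigma$-algebra generated by the functions
$\varphi \mapsto \varphi(f)$ as $f$ runs through $C_0^\infty(\mathcal{G})$; and
if $S$ is an arbitrary non-empty subset of $\mathbb{R}^d$, define
$\mathcal{F}_S = \bigcap_{\varepsilon > 0} \mathcal{F}_{S^\varepsilon}$, where
$S^\varepsilon = \{x \in \mathbb{R}^d : |x - S| < \varepsilon\}$. A probability
measure $\mu$ on $\mathcal{S}'(\mathbb{R}^d)$ is called a Markov random field if
for all bounded open $\mathcal{G} \subset \mathbb{R}^d$ and all bounded
$\mathcal{F}_{\mathcal{G}}$-measurable
$\Phi\colon \mathcal{S}'(\mathbb{R}^d) \to \mathbb{C}$,
$E^\mu[\Phi|\mathcal{F}_{\mathcal{G}^c}] = E^\mu[\Phi|\mathcal{F}_{\partial\mathcal{G}}]$
(a.s., $\mu$).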

There are many known examples of Markov random fields in this context 
(cf. [4], [8], and Theorem (1.5) below). Simplest among these are those which 
are Gaussian and have conditional marginals which are translation invariant. It 
is with such fields that we will be dealing in this paper. Indeed, the problem that 
we want to solve is that of describing the set of Markov random fields $\nu$ which
have the same conditional marginals as a given Gaussian translation invariant
Markov random field $\mu$. That is, given $\mu$, let $\mathcal{M}_\mu$ be the set of all $\nu$ such that
$\nu|_{\mathcal{F}_{\mathcal{G}}} \ll \mu|_{\mathcal{F}_{\mathcal{G}}}$ for all bounded open $\mathcal{G}$'s and
$E^\nu[\Phi \mid \mathcal{A}_{\mathcal{G}^c}] = E^\mu[\Phi \mid \mathcal{A}_{\partial\mathcal{G}}]$ (a.s., $\nu$) for all
bounded open $\mathcal{G}$'s and all bounded $\mathcal{F}_{\mathcal{G}}$-measurable $\Phi$'s. We want to describe
$\mathcal{M}_\mu$. (It is hard to miss the analogy between this problem and the problem of
phase transition for a lattice gas as formulated by Dobrushin, Lanford, and
Ruelle. Indeed, the problem is the same; the only change is in the context.) What
we will show is that if the covariance of $\mu$ is given by $(\varphi, (-L)^{-1}\varphi)$, where $L$ is a
constant coefficient differential operator, then (under mild conditions on $L$) the
extreme elements of $\mathcal{M}_\mu$ coincide with the translates of $\mu$ by tempered distributions
$H$ satisfying $LH = 0$. As a consequence we see that if $L1 = 0$ then there
are many translation invariant Markov random fields with the specified
conditional expectations. This situation should be compared with those in [1], [4],
and [7]. Also the analogy between our results and those of Dobrushin [2]
should be noted. Qualitatively the results are the same; however, in [2]
Dobrushin concerns himself with random fields over the integer lattice.


* Research partially supported by N.S.F. Grant MCS 77-14881 A01.
** Guggenheim Fellow.

Finally, once we have obtained the description of $\mathcal{M}_\mu$, we have been able to
isolate an analytically prescribed subspace $S$ of $\mathcal{S}'(\mathbb{R}^d)$ such that $\mu(S) = 1$ and the
only extreme $\nu \in \mathcal{M}_\mu$ with $\nu(S) > 0$ is $\nu = \mu$.

We take this opportunity to thank L. Accardi for mentioning this problem to 
us. Our only regret is that he did not have time to tell us for what he wanted to 
use the solution. We hope to eventually find out. 


Section (1) 


Up through Theorem (1.5), the contents of this section are simply our in- 
terpretation of Nelson’s ideas. We have put in the details mostly to satisfy 
ourselves that Nelson’s scheme works without a hitch even in the “mass free” 
case. Furthermore, we will need the notation introduced along the way. 

Let $\sigma: \mathbb{R}^d \to \mathbb{R}^1$ be an even non-negative polynomial and denote by $L$ the real
constant coefficient differential operator whose symbol is $\sigma$. That is, $Lf = (\sigma \hat f)^\vee$ for
$f \in \mathcal{S}(\mathbb{R}^d)$, where "$\wedge$" and "$\vee$" are, respectively, the symbols denoting Fourier and
inverse Fourier transform. Throughout we will assume that


(1.1) $\int_{\{x:\, \sigma(x) \le 1\}} \frac{1}{\sigma(x)}\, dx < \infty.$


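Next, define $A^{1/2}: \mathcal{S}(\mathbb{R}^d) \to L^2(\mathbb{R}^d)$ by $A^{1/2} f = (\sigma^{-1/2} \hat f)^\vee$ and introduce on $\mathcal{S}(\mathbb{R}^d)$
the inner product $(\cdot,\cdot)_A$ given by $(f, g)_A = (A^{1/2} f, A^{1/2} g)$ (throughout $(\cdot,\cdot)$ stands
for the usual Hermitian $L^2(\mathbb{R}^d)$-inner product). Complete $\mathcal{S}(\mathbb{R}^d)$ with respect to
the norm $\|\cdot\|_A$ determined by $(\cdot,\cdot)_A$, and let $\mathcal{H}_A$ denote this completion. It will
be convenient for us to identify $\mathcal{H}_A$ with the space of $\chi \in \mathcal{S}'(\mathbb{R}^d)$ such that
$\hat\chi \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ and $\int \frac{1}{\sigma(x)} |\hat\chi(x)|^2\, dx < \infty$. That is, we will think of $\mathcal{H}_A$ as a subspace
of $\mathcal{S}'(\mathbb{R}^d)$. Clearly, the action of $\chi \in \mathcal{H}_A$ as an element of $\mathcal{S}'(\mathbb{R}^d)$ is given by
$\chi(f) = \int \hat\chi(x)\, \overline{\hat f(x)}\, dx$. Observe that if $\rho \in C_0^\infty(\mathbb{R}^d)$ with $\int \rho(x)\,dx = 1$ and
$\rho_\varepsilon(x) = \varepsilon^{-d} \rho(x/\varepsilon)$, $\varepsilon > 0$, then for any $\chi \in \mathcal{H}_A$:


$\|\rho_\varepsilon * \chi - \chi\|_A^2 = \int \frac{1}{\sigma(x)} |\hat\rho(\varepsilon x)\hat\chi(x) - \hat\chi(x)|^2\, dx \to 0$

as $\varepsilon \downarrow 0$, by Lebesgue's dominated convergence theorem. Hence if $\chi \in \mathcal{H}_A$ and
$\mathrm{supp}(\chi) \subset\subset \mathcal{G}$ ($\subset\subset$ means "compactly contained"), then we can choose $\{f_n\}_1^\infty
\subseteq C_0^\infty(\mathcal{G})$ so that $\|f_n - \chi\|_A \to 0$.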
Now let $\mu$ be the probability measure on $\mathcal{S}'(\mathbb{R}^d)$ such that


(1.2) $E^\mu[e^{i\varphi(f)}] = e^{-\frac{1}{2}(f,f)_A}$, $f \in \mathcal{S}(\mathbb{R}^d)$.

Clearly $\mu$ is Gaussian and translation invariant. Denote by $\mathcal{X}_A$ the closure in
$L^2(\mu)$ of the set of random variables $\varphi(f)$, $f \in \mathcal{S}(\mathbb{R}^d)$. Thinking of $\mathcal{S}(\mathbb{R}^d)$ as a
dense subspace of $\mathcal{H}_A$ and noting that $f \mapsto \varphi(f)$ is an isometry taking $\mathcal{S}(\mathbb{R}^d)$ into
a dense subspace of $\mathcal{X}_A$, we see that $\mathcal{H}_A$ is isometrically isomorphic to $\mathcal{X}_A$.
Given $\chi \in \mathcal{H}_A$, we will use $X_\chi$ to stand for the element of $\mathcal{X}_A$ into which $\chi$ is sent
by this isomorphism.

If S+@ is a subset of R4, we use F, and F, to denote, respectively, the o- 
algebras over /'(R*) generated by {p(f): feY(R*) and supp(f)cS} and 
{X,: ~¢H, and supp(z) S$}. Clearly ¥,<4¥,. On the other hand, if S is open 
and ye, satisfies supp(z)<S, then we can find {f,}P °C (S) such ||o(f,) 
—X Mhe2w=llf.—Xlla79 as n> 00. Thus 


(1.3) F,=F,(as., uw), S open in R’. 


Next set /,= (|) Fs. where S*={xeR*: |x—S|<e}. Because of (1.3) 


e>0 


G5=As=(\F5 (as. yu). 
e>0 


(1.4) Lemma. Let Y be a bounded open subset of R‘ and denote by x the 
orthogonal projection in #, onto the subspace {ye H,: supp(y)SY*}. Then for 
any geCo(Y), X,, iS Fyg-measurable and Z,=(g)—X,, is a Gaussian random 
variable which has mean 0 and is independent of Fye. 


Proof. To see that X,, is F,g-measurable we need only check that supp(zg) 
cdéY. Certainly supp(zg)¢¥Y°. At the same time, if weCp((Y)), then 
LweCp ((Y)) and so: 


ng(W)=(ng, LW), =(g, LW), =(g, y)=0. 


Thus supp(zg)S0g. 

To prove the desired properties of Z,, note that #%, under p is a Gaussian 
family of mean 0 random variables and therefore Z, is certainly a mean 0 
Gaussian random variable. Furthermore, if ye %, with supp(y) x ¥*, then 


E*(Z,X,] =((I—7) g, Hs=9. 
Hence Z, is independent of F,.. Q.E.D. 


(1.5) Theorem (Nelson). Let Y be a bounded open set in R4 and ®: #'(R*)>C a 
bounded F,-measurable function. Then E"(®| %4.]=E"[®| Hg] (a.s., u). In par- 
ticular, there is an J,-measurable version of E"(®| %,]. 


Proof. Because p is a Gaussian measure, we need only check that for each 
geCo (GY): E*[e(g)|%,-] admits an .~/,,.-measurable version. To this end, set 
G,=((‘)') and define 2, accordingly (as in Lemma (1.4)) with Y, replacing 9. 
Then, by Lemma (1.4), E"[9(g)| Fgj]=X neg (a.S., ); and so, by the remarks 
preceding Lemma (1.4), E“[(g)| Fe,,1 admits a F(,g)2n-measurable version ®,,. 
On the other hand, by the martingale convergence theorem, E“[9(g)| Fu, 1] 
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+ E*[(g)| Ay.| as n> 00; and, again by the remarks preceding Lemma (1.4), 

E"[9(g)| %g-] =E"[e(g)| %,-] (a.s., uw). Hence, &, > E*[p(g)| Ay] (as., ), and 

therefore E"[p(g)| %,-] admits an ,4= () Fog)2n-measurable version. Q.E.D. 
n21 


We now know that p is a Markov random field. Define the set ./@, as in the 
introduction. Our next result shows that in general ./@, will contain elements 
besides yp itself. 


(1.6) Theorem. Let $H \in \mathcal{S}'(\mathbb{R}^d) \cap C^\infty(\mathbb{R}^d)$ satisfy $LH = 0$. Then the translate $\mu_H$ of
$\mu$ by $H$ (i.e. $\mu_H$ is the distribution of $\varphi + H$ under $\mu$) is an element of $\mathcal{M}_\mu$.


Proof. We must first show that Hyly,<Hly, for bounded open #'s, and clearly it 
is enough to do this when 9Y=B(0,R) for R>0O. Given R, choose 
neCp(B(O, 2R)) so that y=1 on B(O,3R/2), and set f= L(y H) and 


X(g)=exp[e(f)-1/2F fal, veS'(R%. 
Then for any geCo (B(0, 5R/4)): 


E*[e'*® X()]=exp[—1/2(g, g)at+ilg, fa] 
=exp[—1/2(g, g),+i(g, AL(nH))] 
=exp[—1/2(g, 8),+iH(ng)] 
=exp[—1/2(g, g),+iH(g)]=E""[e'?®]. 

Thus Xdy equals duy on Sp¢0, gy, AN SO Myler gio, 2) <Hleta,0, zy" 
To prove that E"*[®|.%J,.]=E"[®| 9] (as, wy) for bounded F,- 
measurable @’s, choose Ry >0 so that ¥< < B(O, Ry) and let R> Rg be arbitrary. 


Define n, f, and X as in the preceding paragraph relative to R. Then for any 
BE Gye. Bo, R)* 


E"#(@, B]=E"(X ®, B]=E"[XE"[®| yc], B] 
= E*[XE"[®| ,¢], B]=E"" [E"[9| 4], B] 


since supp(f)¢ B(0, R)° and therefore X is ./,.-measurable. Q.E.D. 


The rest of this section is devoted to proving the converse of Theorem (1.6). 
For this purpose, we want to show that every ve.@, is a stationary measure for 
a certain Ornstein-Uhlenbeck flow on ¥'(R‘). Since in an earlier paper [3], we 
classified all such stationary measures, we will be essentially done once we have 
shown this. 

Define the semi-group $\{T_t : t \ge 0\}$ on $\mathcal{S}(\mathbb{R}^d)$ by $T_t f = (e^{-t\sigma/2} \hat f)^\vee$. For convenience,
we will use $f_t$ to denote $T_t f$. It is shown in [3] that $\{T_t : t \ge 0\}$
determines a unique transition probability function $P(t, \psi, \cdot)$ on $\mathcal{S}'(\mathbb{R}^d)$ via the
equation:


(1.7) $\int F(\varphi(f))\, P(t, \psi, d\varphi) = \int \gamma\Big(\int_0^t \|f_s\|^2\, ds,\; y - \psi(f_t)\Big) F(y)\, dy$


for $F \in C_b(\mathbb{R}^1)$ and $f \in \mathcal{S}(\mathbb{R}^d)$, where $\gamma(t, \xi) = \frac{1}{(2\pi t)^{1/2}} e^{-\xi^2/2t}$. $P(t, \psi, \cdot)$ is the
transition probability function for the Ornstein-Uhlenbeck flow alluded to in the
preceding paragraph.


(1.8) Lemma. Given f =(f,,...,f,)e(A(R)", let T}(t,*) denote the Gaussian 


measure on R" having mean 0 and covariance 


(0 49948)) so 


Then for FEC,(R"): 


=(F *IP(t, -)) (Wf )ps ---» W(Pe)- 


In particular, if Fe C?(R") and we define 


F'(E,, .--. oe Q=(F TP (t, *)(E1, ---> Ea 
then 


(1.10) J FQ») +» Of) Plt W, ge fag (fi), +», Wh) 


“f(, (fds (fps) wis * 


=F WLU) Se) WU Dd + MUDD ds: 


and so as t|0: 


(11) [FFU 5 ON) PC. d9)- FWD WUD 


. 7) 
12 ( F UnA) geae~ 3 WLM SE) WU WU) 


point-wise as well as in L!(y). 


n 
Proof. It is sufficient to prove (1.9) for F(é,,...,€,)=exp i - hj a in which 
j=1 


case (1.9) is a consequence of (1.7). Given (1.9), (1.10) follows by differentiating 
(F «I7(t,*)) (W(f,),),----W(f,),)) with respect to ¢. Finally, the point-wise 


convergence in (1.11) is evident from (1.10). To prove that the convergence takes 
place also in L'(u), observe that from (1.10) one obtains the estimate 


1 
UFO). --- fd) Plt, ¥, d—)— FW). 


scje{ (1+ DW LaF) as 
1 
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where C depends only on the [7(R*)-norms of the f;s and the C}(R")-bounds on 
F. Hence, since p is a Gaussian measure on /'(R‘) and s—> L( fj), is continuous, 
we see that 


1 
sup BE" Ef Fo fs. 9) Plt v9) 
<t< : 
— FUL, Wp] |<oo. QED. 


(1.12) Lemma. Let &, ¥eC,(¥'(R4)). Then 


(1.13) J YW) (g) P(t, v, dg) u(y) 
=| d(v)(j P(g) P(t, ¥,d@)) ud), t=0. 


In particular, u is P(t, W,+)-invariant, and so if f,, ...,f,¢S(R*) and FeC?(R"), 
then 


- 3 WL) 5 5E | Wid WF) ald) =0, 


ix=1 


(1.14) [3 Wied ae 3k 4 


Proof. We need only prove (1.13) for ®(g)=e'® and Y(g)=e'*® where 
f, geS(R*). But 


je” (fee P(t, w, dg) ud) 


. j a f 2 
=f eivie iM -17 JWI as ay) 


=exp 5 28+ fn 8+Ma—U21 WGN? as| 
=exp[—12g, gt Eat el+h gh 


The last of these equalities results from 
(ff) +] Will? as=|— eeed | i ax+ [ewe |f (x)? dx) ds 


“J f(x)? dx=(Lf)4 
plus 


(8, a= i e120) B(x) F(x) dx=(f, g,),- 


Since the final expression is symmetric in f and g, (1.13) now follows. 
The invariance of yw is obvious from (1.13) upon taking Y=1. Finally, 


combining the invariance of 4 with the convergence result in (1.11), one easily 
arrives at (1.14). Q.E.D. 
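
The identity $(f_t, f_t)_A + \int_0^t \|f_s\|^2\, ds = (f, f)_A$ used in the proof above can be checked numerically. The following is only a sketch under stated assumptions: it works in dimension one with the hypothetical symbol $\sigma(x) = x^2 + 1$ and a hypothetical test function with $|\hat f(x)|^2 = e^{-x^2}$ (neither is taken from the paper), and it ignores the normalizing constant of the Fourier transform, which multiplies both sides equally.

```python
import numpy as np


def trapezoid(y, x):
    """Composite trapezoid rule, written out to avoid NumPy version differences."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)


def sigma(x):
    # Hypothetical symbol for this sketch only: sigma(x) = x^2 + 1, d = 1.
    return x**2 + 1.0


def f_hat_sq(x):
    # Hypothetical test function, specified through |f^(x)|^2 = exp(-x^2).
    return np.exp(-x**2)


def a_norm_sq_of_f_t(t, x):
    """(f_t, f_t)_A = int sigma^{-1} e^{-t sigma} |f^|^2 dx, since f_t^ = e^{-t sigma/2} f^."""
    return trapezoid(np.exp(-t * sigma(x)) * f_hat_sq(x) / sigma(x), x)


def l2_norm_sq_of_f_s(s, x):
    """||f_s||^2 = int e^{-s sigma} |f^|^2 dx."""
    return trapezoid(np.exp(-s * sigma(x)) * f_hat_sq(x), x)


def check_identity(t, n_x=4001, n_s=2001):
    """Return both sides of (f_t, f_t)_A + int_0^t ||f_s||^2 ds = (f, f)_A."""
    x = np.linspace(-10.0, 10.0, n_x)
    s = np.linspace(0.0, t, n_s)
    time_integral = trapezoid(np.array([l2_norm_sq_of_f_s(si, x) for si in s]), s)
    lhs = a_norm_sq_of_f_t(t, x) + time_integral
    rhs = a_norm_sq_of_f_t(0.0, x)
    return lhs, rhs


if __name__ == "__main__":
    print(check_identity(1.5))
```

The two returned numbers agree up to quadrature error, because pointwise in $x$ one has $\sigma^{-1} e^{-t\sigma} + \int_0^t e^{-s\sigma}\, ds = \sigma^{-1}$.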


(1.15) Lemma. Let $f, g \in C_0^\infty(\mathbb{R}^d)$ have disjoint supports. Then for all $F \in C_b^2(\mathbb{R}^1)$
and $G \in C_b(\mathbb{R}^1)$:
(1.16) $\int \big(\|f\|^2 F''(\psi(f)) - \psi(Lf) F'(\psi(f))\big)\, G(\psi(g))\, \mu(d\psi) = 0.$


Proof. Clearly it is enough to prove (1.16) for F, GeC%(R'). Given such F and G 
and applying (1.14) to F(@(f))- G(@(g)), we obtain 


SFI? F°WP)—-WLA) F' WP) GW (g)) HW) 
= —J (ig? G’(w(g))— (Lg) G'((g)) FS) wd), 
since (f, g)=0. On the other hand, from (1.13) we know that 
JS F@) Pt v, dg)—FW(S)) Gig) ud) 
=f (J G(p(g)) P(t, v, dg) —G(W(g)) F(S)) mld). 
Hence, if we apply (1.11) to both sides, we get: 
SUIS? F’°WP)-WL/) FWP) Ge) wy) 
=| (lig? 6’ W(g)— (Lg) G (W(g)) FW) uy). 
After combining these two, we arrive at (1.16). Q.E.D. 


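(1.17) Lemma. If $\mathcal{G}$ is a bounded open set in $\mathbb{R}^d$ and $f \in C_0^\infty(\mathcal{G})$, then for every
$F \in C_b^2(\mathbb{R}^1)$:


(1.18) $\int_B \big(\|f\|^2 F''(\psi(f)) - \psi(Lf) F'(\psi(f))\big)\, \mu(d\psi) = 0$, $B \in \mathcal{A}_{\mathcal{G}^c}$.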


Proof. Since fEC>(Y), we can find ¢,>0 so that supp(f) n(G‘°)"°= 2. Hence if 
0<e<eé, and geCe ((Y‘)'), then, by (1.16), for all GeC,(R’): 


SUSI7F’ WP) -—WL/) F'WL)) Gg) Hay) =0. 
Clearly (1.18) is a consequence of this. Q.E.D. 


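(1.19) Lemma. If $\nu \in \mathcal{M}_\mu$, then for all $f \in \mathcal{S}(\mathbb{R}^d)$ the random variable $\psi(Lf)$ has the
same distribution under $\nu$ and $\mu$. Furthermore, for each $f \in \mathcal{S}(\mathbb{R}^d)$ and $F \in C_b^2(\mathbb{R}^1)$:


(1.20) $\int \big(\|f\|^2 F''(\psi(f)) - \psi(Lf) F'(\psi(f))\big)\, \nu(d\psi) = 0.$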


Proof. We need only prove the first part under the assumption that f¢C?(R%). 
Given feEC%(R‘), choose R>0O so that supp(f)<c <B(0,R). Then, by 
Lemma (1.4), we can write p(Lf)=X,,,;+Z ,, where Z,, is independent of 


Fo, rye Hence 


E’ [e422] = EY [E"[ei42-s | Guo, om)) = E*[ei4411] 


SINCE A > peF Bo,rye and ve.M@,. Thus if we can show that xLf=0, then we 
will be done. But supp(zLf)< B(0, R)° and so 


Lf |2=(n Lf, Lf) = 2L7 (x) Fo) dx=(nLf)(f)=0 
since f EC (B(0, R)). 
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In proving (1.20), we can and will assume that feC?(B(0, R)) for some R>0. 
Given FeC}(R'), define &(~)=|| f ||? F’(e(f))— (Lf) F’'(f)). Then by the pre- 
ceding ®eL'(v). Moreover, since ® is Ago p»-measurable, we can combine 
Theorem (1.5) and (1.18) to conclude that 


E*(®| oo z_l=O (as. pu). 


Hence since E’[®| % io pe]=E"[P| oo, py] (as, v), we have E*[®] 
=0. QED. 


It should be noted that if $A: \mathcal{S}(\mathbb{R}^d) \to \mathcal{S}(\mathbb{R}^d)$, then the first part of
Lemma (1.19) shows that $\mathcal{M}_\mu = \{\mu\}$ without any further ado, since we can in this
case write every $f \in \mathcal{S}(\mathbb{R}^d)$ as $Lg$ with $g = Af$. Thus all our machinery involving
$P(t, \psi, \cdot)$ is relevant only when $A$ fails to map $\mathcal{S}(\mathbb{R}^d)$ into itself.


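(1.21) Theorem. If $\nu \in \mathcal{M}_\mu$, then $\nu$ is $P(t, \psi, \cdot)$-invariant.
Proof. Let $f \in \mathcal{S}(\mathbb{R}^d)$ and $F \in C_b^2(\mathbb{R}^1)$ be given and define $F^t$ accordingly as in
Lemma (1.8) (here $n = 1$). Then by (1.9):

$\int F(\varphi(f))\, P(t, \psi, d\varphi) - F(\psi(f))$
$= \frac{1}{2} \int_0^t \big(\|f_s\|^2 (F^s)''(\psi(f_s)) - \psi(Lf_s)(F^s)'(\psi(f_s))\big)\, ds.$

Furthermore, by (1.20), for each $s$

$\int \big(\|f_s\|^2 (F^s)''(\psi(f_s)) - \psi(Lf_s)(F^s)'(\psi(f_s))\big)\, \nu(d\psi) = 0.$

Finally, by the first part of Lemma (1.19),

$\sup_{0 \le s \le t} E^\nu[\psi(Lf_s)^2] = \sup_{0 \le s \le t} E^\mu[\psi(Lf_s)^2] < \infty.$

Hence we can apply Fubini's theorem to complete the proof. Q.E.D.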


In order to arrive at our final result, we must borrow the following fact 


about the structure of P(t, ,-)-invariant measures from [3] (cf. Theorem (5.7) 
and Lemma (5.17)). 


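(1.22) Theorem. If $\nu$ is $P(t, \psi, \cdot)$-invariant, then there is a unique probability
measure $m_\nu$ on $\{H \in \mathcal{S}'(\mathbb{R}^d) : LH = 0\}$ such that $\nu = \int \mu_H\, m_\nu(dH)$.


Combining Theorems (1.6), (1.21), and (1.22) we arrive at our main result.


(1.23) Theorem. If $m$ is a probability measure on $\{H \in \mathcal{S}'(\mathbb{R}^d) \cap C^\infty(\mathbb{R}^d) : LH = 0\}$,
then $\int \mu_H\, m(dH) \in \mathcal{M}_\mu$. Conversely, if $\nu \in \mathcal{M}_\mu$, then there is a unique probability
measure $m_\nu$ on $\{H \in \mathcal{S}'(\mathbb{R}^d) : LH = 0\}$ such that $\nu = \int \mu_H\, m_\nu(dH)$. Hence, if
$\{H \in \mathcal{S}'(\mathbb{R}^d) : LH = 0\} \subseteq C^\infty(\mathbb{R}^d)$, then the mapping $m \mapsto \int \mu_H\, m(dH)$ defines a one-to-
one mapping from the set of probability measures on $\{H \in \mathcal{S}'(\mathbb{R}^d) : LH = 0\}$ onto
$\mathcal{M}_\mu$.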
(1.23) Remark. The condition {He¥#'(R*): LH =0} < C®”(R*) is not so restrictive 
as it may appear at first. Indeed, for many choices of o(-), it is possible to check 
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this condition by hand (e.g. when o(-)=09(+)+0,(+) where o9(+) is homo- 
geneous polynomial of degree 2n such that o9(x)>0 for x +0 and o,(-) is a non- 
negative polynomial of degree strictly less than 2n). The best result that we 
know on this subject are due to L. H6rmander and can be found in section (4.1) 


of this book [4]. Perhaps the most useful sufficient condition is that o(-)>0 on 
R4~ {0}. 


(1.24) Remark. The introduction of P(t, ,-) may appear to be simply a device 
with which we have reduced the problem at hand to one which has already been 
solved. Indeed, a more direct route to Theorem (1.22) would run as follows. 
Starting from Lemma (1.19), we know that for any ve.@, and ge¥(R*) the 
distribution of g(Lg) under v is the same as it is under pu. Now suppose that for 
any feS(R*) we could construct {g,}9<S(R*%) so that || f—Lg,||,70 and f 
—Lg, vanishes on B(O, n)=({xeER*: |x|<n}). Setting f,=Lg, and h,=f—f,, we 
would have: o(f)=(f,)+ o(h,) where o(f,) under v is a mean 0 Gaussian with 


variance (f,, L~'f,). Furthermore, we would know that ¢(f,) Pa ; Where X, 


is a mean 0 Gaussian with variance (f,L~'f). Hence, it would follow that 


“a. Y, where Y, must be tail-measurable and therefore independent of 
X,. Without much trouble, it would also be possible to show that f +X, and 
f —Y, can be chosen so that X, and Y, are tempered distribution-valued random 
variables. Finally, it is clear that Y,, would have to be 0 for all ge (R*) and 
therefore that Y, must be an L-harmonic distribution. The result of all this would 
therefore be that we could write y(f)=X,+Y,, feS(R*), where f +X, and 
f—Y, are independent '(R‘)-valued random variables, X, under v is a mean 
0 Gaussian with variance (f, L~'f), and LY=0. Obviously, this is just what is 
needed to prove part of Theorem (1.22) not covered by Theorem (1.6). 

The preceding paragraph leaves the problem of constructing the sequence 
{g,}{. We have been able to construct {g,}{° in the case when L= — A, but the 
technique that we used does not appear to be readily generalized. This is the 
main reason why we chose the route via P(t, y,-). A secondary reason is that 
the connection between the D.L.R. conditions and the flow determined by 
P(t, w,*) seems to us to be of independent interest in its own right. 


Section 2 


We now have a quite complete description of $\mathcal{M}_\mu$. In particular, we know that if
the only $H \in \mathcal{S}'(\mathbb{R}^d)$ satisfying $LH = 0$ is $H = 0$, then $\mathcal{M}_\mu = \{\mu\}$. This will of course
be the case if $A$ maps $\mathcal{S}(\mathbb{R}^d)$ into itself. On the other hand, if for instance $\sigma(x)
= |x|^2$, there are an infinity of non-trivial $H \in \mathcal{S}'(\mathbb{R}^d)$ such that $LH = 0$. Thus it is
natural to ask if it is not possible to isolate an analytically describable Hilbert
subspace $S$ of $\mathcal{S}'(\mathbb{R}^d)$ such that $\mu(S) = 1$ and the only $H \in S$ satisfying $LH = 0$ is $H
= 0$. If we can find such an $S$, then it is clear that $\mathcal{M}_\mu \cap \{\nu : \nu(S) = 1\} = \{\mu\}$.

In order to prove that $S$ exists, we will make the following additional
assumption about $\sigma(\cdot)$. Namely, we assume that there exists a $\delta > 0$ such that


(2.1) $\int_{\{x:\, \sigma(x) \le 1\}} (\sigma(x))^{-(\delta+1)/\delta}\, dx < \infty.$





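To carry out our program, we need some notation. For $k \ge 0$, let $h_k$ denote
the $k$-th Hermite function:


$h_k(x) = (\pi^{1/2}\, 2^k\, k!)^{-1/2} (-1)^k e^{x^2/2} \frac{d^k}{dx^k} e^{-x^2}$, $x \in \mathbb{R}^1$.


Given a multiindex $\alpha \in I = (\{0, 1, \dots, n, \dots\})^d$ and $x = (x_1, \dots, x_d) \in \mathbb{R}^d$, set


$h_\alpha(x) = h_{\alpha_1}(x_1) \cdots h_{\alpha_d}(x_d)$.


Recall that $\{h_\alpha : \alpha \in I\}$ forms an orthonormal basis in $L^2(\mathbb{R}^d)$ and that $f \in L^2(\mathbb{R}^d)$ is
an element of $\mathcal{S}(\mathbb{R}^d)$ if and only if $\{(f, h_\alpha) : \alpha \in I\}$ is rapidly decreasing, in which
case $\sum_{|\alpha| \le N} (f, h_\alpha) h_\alpha \to f$ as $N \uparrow \infty$ in $\mathcal{S}(\mathbb{R}^d)$.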
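
As an illustration of the orthonormality just recalled, here is a minimal numerical sketch in dimension one. It relies on the fact that $(-1)^k e^{x^2} \frac{d^k}{dx^k} e^{-x^2}$ is the physicists' Hermite polynomial $H_k$, so that $h_k(x) = (\pi^{1/2} 2^k k!)^{-1/2} H_k(x) e^{-x^2/2}$. The function names and the choice of Gauss-Hermite quadrature are assumptions of this sketch, not part of the paper.

```python
import math

import numpy as np
from numpy.polynomial.hermite import hermgauss, hermval


def hermite_polynomial_normalized(k, x):
    """H_k(x) / sqrt(sqrt(pi) 2^k k!), with H_k the physicists' Hermite polynomial."""
    coeffs = np.zeros(k + 1)
    coeffs[k] = 1.0
    norm = math.sqrt(math.sqrt(math.pi) * 2.0**k * math.factorial(k))
    return hermval(x, coeffs) / norm


def hermite_function(k, x):
    """k-th Hermite function h_k(x) = normalized H_k(x) times exp(-x^2 / 2)."""
    x = np.asarray(x, dtype=float)
    return hermite_polynomial_normalized(k, x) * np.exp(-x**2 / 2.0)


def gram_matrix(kmax, nodes=40):
    """Matrix of L^2 inner products (h_j, h_k), 0 <= j, k <= kmax.

    Gauss-Hermite quadrature integrates p(x) exp(-x^2) exactly for polynomials p
    of degree below 2 * nodes, and h_j h_k has exactly that form.
    """
    x, w = hermgauss(nodes)
    p = np.array([hermite_polynomial_normalized(k, x) for k in range(kmax + 1)])
    return (p * w) @ p.T


if __name__ == "__main__":
    print(np.round(gram_matrix(3), 12))
```

The Gram matrix returned by `gram_matrix` should be the identity up to rounding; the multi-dimensional functions $h_\alpha$ inherit orthonormality because they are products of one-dimensional factors.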


(2.2) Lemma. For the 6>0 in (2.1): 
Lu (J: ce (Fla o(Tshyl?<co}) =I, 
n=0 a=1 
where {T,: t=0} is the semi-group on /(R*) introduced before (1.7). 
Proof. Certainly it suffices to show that 


e | Y Yd +lay-0472 lolTeh | <en 


n=0 ael 


But 


co 2 
E*| y ¥, (1 + lal) 84+??? | p(T, sh,) 


n=0 ael 


(2.3) =) (1 +Ja|)- errs ls eo Nha)? ax. 


ael 


Now write G,(A)= = e7 4" — ! e~*dU,(t), where U,(t)= ) 1=[t'/*]+1. By a 
n°<t 

standard Abelian canton (cf. ‘* 420 of [1]), we see that G,(A) is asymptotic to 

I'(i+1/6)4-*/? as 210. Since G;(A) is bounded for A in each interval [e, 00) with 


é>0, we now obtain: 
1 1/6 
G,)sC4(6)((-) v1). 


With this estimate, we get: 


J la ae 


SCL f(x)" +h, (oP dx 


{x: o(x) $1} 


+ J |h,@)? dx] 


{x: o(x) 2 1} 


SCo(S)L[hgllZocrq, f (ox) - 9+ dx+1]. 


{x: a(x) S 1} 


e-o) |h, (x)|? dx 
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But ||h, | ;-0cpa)S C,(d) (1+|al)"* (cf. (A.12) in [3]), and so there is a C(d,5)< 0 
such that 


co 


o, —n® a(x 2 ‘ 
jd a(x) |h,(x)|? dx < C(d, 5) (1 + ||)". 


Plugging this back into (2.3), we have: 


EP [x Y (+ lal“? o(TahgP | 


n=0 ael 


<C(d, 5) F (1+|al)-“*% <0. QED. 


ael 


We now define 
(2.4) =| 969 (R) pap? (Fla) -©4*27 (shy? < coh. 
n=0 ael 


It is clear that S can be given a natural Hilbert space structure and that S is a 
dense F,-subset of 7’(R*). 


(2.5) Lemma. If geéS, then 9(T,f)>0 as n> 00 for all fe (R*‘). In particular, 
if HES and LH=0, then H=0. 


Proof. Given feS(R*‘), we have: 


lo(Te A =| 2 (fh) P(Thsh,)| 


SID (Gh)? (1 + lee O4 47] "7| 


ael 


YA tla Ott (Toh 71", 


ael 
and, since geS, 


¥ (1+ lal)“ 94+??? \@(T,sh,)/?+0 as n>00. 


ael 
Finally, if HeS and LH=0, then H(T,f)=H(f) for all t2>0 and feY(R‘). 
Hence H(f)=lim H(T,sf)=0, feY(R‘%). Q.E.D. 
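(2.6) Lemma. If $H \in \mathcal{S}'(\mathbb{R}^d)$ and $LH = 0$, then $\mu_H(S) > 0$ implies $H = 0$.


Proof. If $\mu_H(S) > 0$, then, since $\mu_H(S + H) = 1$, $S \cap (S + H) \ne \emptyset$. Hence there exist
$\varphi, \psi \in S$ such that $\varphi = \psi + H$. But this means that $0 = LH = L(\varphi - \psi)$, and so, since
$\varphi - \psi \in S$, Lemma (2.5) gives $\varphi = \psi$. In other words $H = 0$. Q.E.D.


(2.7) Theorem. Let $S$ be defined as in (2.4). Then $\mathcal{M}_\mu \cap \{\nu : \nu(S) = 1\} = \{\mu\}$.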
A Structure of the Bang-Bang Representation
for 3 × 3 Embeddable Matrices


Halina Frydman* 


524 Tish Hall, New York University, Washington Square, New York, N.Y. 10003, USA 


Summary. Necessary and sufficient conditions are given for a 3 × 3 stochastic
matrix to be embeddable by 6 elementary stochastic matrices (Poisson
matrices). For a 3 × 3 embeddable matrix, a structure of the minimal Bang-Bang
representation, i.e. the one that contains the smallest number of
elementary matrices, is obtained. Based on the minimal Bang-Bang
representation, an algorithm for determining the embeddability of a 3 × 3
stochastic matrix is given.


1. Introduction, Survey of Results, and Summary 


We consider the embedding problem for Markov chains with three states. A 
nonsingular stochastic matrix P is called embeddable if there exists a two- 
parameter family of stochastic matrices 


$\{P(s,t) : 0 \le s \le t < +\infty\}$
satisfying $P(s,t) = P(s,u)P(u,t)$ $(0 \le s \le u \le t)$,


$\lim_{t \downarrow s} P(s,t) = \lim_{s \uparrow t} P(s,t) = I$   (1.1)

and such that $P(0,1) = P$.

The embedding problem was reformulated by Goodman [3] as a control 
problem for differential equations. Goodman showed that a nonsingular sto- 
chastic matrix P is embeddable if and only if there is a two-parameter family of 
absolutely continuous matrix functions $\{P(s,t),\ 0 \le s \le t < +\infty\}$ satisfying (1.1),


$\frac{\partial}{\partial t} P(s,t) = P(s,t)\, Q(t)$   $(t \notin N)$,   (1.2)

$\frac{\partial}{\partial s} P(s,t) = -Q(s)\, P(s,t)$   $(s \notin N)$,   (1.3)


where $N$ is a null set, and such that $P(0,1) = P$.
For each $t \ge 0$


$Q(t) \in \mathcal{Q} = \{Q : q_{ii} \le 0,\ q_{ij} \ge 0,\ i \ne j,\ \sum_j q_{ij} = 0\}$,


the class of intensity matrices.


* I would like to thank Søren Johansen for helpful comments and stimulating discussions on the
subject of this paper

The embeddable matrices are thus the matrices that can be reached from the
identity $I$ via (1.2) and (1.3) using a suitable controller $Q(\cdot) \in \mathcal{Q}$. The intensity
matrices form a convex cone and the extremal elements have at most one
positive off-diagonal element. A stochastic matrix which can be reached via (1.2)
or (1.3) using an extremal intensity matrix $Q$ as a controller is called a Poisson
matrix and is of the form $e^{\lambda Q}$, $\lambda > 0$.
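
To make the form $e^{\lambda Q}$ concrete, the sketch below takes an extremal intensity matrix $Q = q(E_{ij} - E_{ii})$ with a single positive off-diagonal rate $q$ and compares a truncated power series for $e^{\lambda Q}$ with the closed form $I - uE_{ii} + uE_{ij}$, $u = 1 - e^{-\lambda q}$. The closed form follows from $Q^2 = -qQ$ and is the shape the paper later denotes $A_{ij}(u)$. The helper names, the zero-based indices and the series truncation are assumptions of this sketch.

```python
import numpy as np


def unit_matrix(i, j, n=3):
    """E_ij: the n x n matrix with a single 1 in position (i, j), zero-based."""
    E = np.zeros((n, n))
    E[i, j] = 1.0
    return E


def extremal_intensity(i, j, q=1.0, n=3):
    """Extremal intensity matrix Q = q (E_ij - E_ii), i != j, q > 0."""
    return q * (unit_matrix(i, j, n) - unit_matrix(i, i, n))


def expm_series(M, terms=60):
    """Truncated Taylor series for the matrix exponential (adequate for small norms)."""
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        result = result + term
    return result


def poisson_matrix(i, j, u, n=3):
    """I - u E_ii + u E_ij, 0 <= u < 1."""
    return np.eye(n) - u * unit_matrix(i, i, n) + u * unit_matrix(i, j, n)


if __name__ == "__main__":
    lam, q = 0.7, 2.0
    print(expm_series(lam * extremal_intensity(0, 2, q)))
    print(poisson_matrix(0, 2, 1.0 - np.exp(-lam * q)))
```

Both printed matrices should coincide up to rounding, and each has rows summing to one.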

Applying the chattering principle from control theory, see [8], to the control 
system specified by (1.1), (1.2) and (1.3), Johansen [4] formulated the following 
characterization of embeddable matrices: any embeddable matrix can be appro- 
ximated by a finite product of Poisson matrices. Johansen [5] further proved 
that any matrix in the interior of the set of embeddable matrices has a 
representation as a finite product of Poisson matrices, i.e., it has a Bang-Bang 
representation. 

Frydman and Singer [2] obtained the complete solution to the embedding 
problem for the birth and death processes. They showed that the class of 
transition matrices for birth and death processes coincides with the class of non- 
singular totally positive stochastic matrices and that all transition matrices of 
birth and death processes admit a Bang-Bang representation. 

For 3 x3 stochastic matrices Johansen [5] proved, using geometric methods, 
that matrices on the boundary of the set of embeddable matrices admit a Bang- 
Bang representation, see also Frydman [1] for an algebraic proof. Characteri- 
zation of the boundary of the general embeddable matrices is an open problem. 

This paper relies heavily on methods and results in [1]. We will briefly 
summarize results in [1] needed here after we introduce the necessary notation. 
Notation. Throughout this paper we will refer to a 3 x3 stochastic matrix P as 
“a matrix P.” We will denote by P>0 a matrix with all elements positive, and 
by P20 a matrix with at least one off-diagonal element equal to zero. 

Let S= {(i,j, k)\(i,j, k) is a permutation of (1, 2, 3)} and let 


T; =P jiPux — P jxPri 

T,;= P jjPux— PjxPrj 
T=(-1)*!-" My, 
T..=M.: 


i ii 


(i,j, K)eS. 


Note that 


where M;;, M;; are second order minors of P. Observe that for every (i,j, k)eS 


det P=p,, The — P jn Te — Pik Tie = P jj Uj — P ie Tj — Pi Tj: - 
Let $E_{ij}$ be a matrix with elements


$[E_{ij}]_{mk} = 1$ if $m = i$, $k = j$, and $[E_{ij}]_{mk} = 0$ otherwise,


and let $A_{ij}(u)$ denote the following Poisson matrix

$A_{ij}(u) = I - uE_{ii} + uE_{ij}$,   $0 \le u < 1$.

We denote by $Z_{ij}(c)$ the inverse of $A_{ij}(u)$, i.e.,


$Z_{ij}(c) = I + cE_{ii} - cE_{ij}$,   $c = \frac{u}{1-u} \ge 0$.


Let the stochastic matrix $P$ have columns $(p_1, p_2, p_3)$ and let $P_c = PZ_{ij}(c)$.
Then a simple calculation gives

$p_k^{(c)} = p_k$ if $k \ne i, j$;   $p_k^{(c)} = p_j - cp_i$ if $k = j$;   $p_k^{(c)} = p_i(1 + c)$ if $k = i$;


$\det P_c = (1 + c) \det P$.


Note that $P_c = \|p_{ij}^{(c)}\|$ satisfies $\sum_j p_{ij}^{(c)} = 1$, $1 \le i \le 3$, but may not be stochastic.
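
The relations among $A_{ij}(u)$, $Z_{ij}(c)$ and $P_c$ can be checked directly. The following is a small sketch, with zero-based indices and an arbitrary example matrix `P` chosen only for illustration (it is not from the paper).

```python
import numpy as np


def unit_matrix(i, j, n=3):
    """E_ij: single 1 in position (i, j), zero-based."""
    E = np.zeros((n, n))
    E[i, j] = 1.0
    return E


def poisson_matrix(i, j, u, n=3):
    """A_ij(u) = I - u E_ii + u E_ij, 0 <= u < 1."""
    return np.eye(n) - u * unit_matrix(i, i, n) + u * unit_matrix(i, j, n)


def inverse_poisson(i, j, c, n=3):
    """Z_ij(c) = I + c E_ii - c E_ij, the inverse of A_ij(u) when c = u / (1 - u)."""
    return np.eye(n) + c * unit_matrix(i, i, n) - c * unit_matrix(i, j, n)


def shifted_matrix(P, i, j, c):
    """P_c = P Z_ij(c)."""
    return P @ inverse_poisson(i, j, c, P.shape[0])


if __name__ == "__main__":
    # Illustrative stochastic matrix (an assumption of this sketch).
    P = np.array([[0.60, 0.30, 0.10],
                  [0.20, 0.70, 0.10],
                  [0.25, 0.25, 0.50]])
    Pc = shifted_matrix(P, 0, 1, 0.5)
    print(Pc)
    print(np.linalg.det(Pc), 1.5 * np.linalg.det(P))
```

Column $i$ of `Pc` is $(1 + c)p_i$, column $j$ is $p_j - cp_i$, the remaining column is unchanged, and the rows still sum to one even when some entry of `Pc` becomes negative.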
It was shown by Goodman [3] that $\prod_{i=1}^{n} p_{ii} \ge \det P > 0$ is a necessary condition
for embeddability of an $n \times n$ stochastic matrix $P$. For $n = 3$ we proved in
[1]:

Theorem 1.1. $\prod_{i=1}^{3} p_{ii} \ge \det P > 0$ is a sufficient and necessary condition for
embeddability of a matrix $P$ with at least one off-diagonal element equal to zero, and in
this case $P = \prod_{m=1}^{5} A_m$, where $(A_m,\ 1 \le m \le 5)$ are Poisson matrices.


A similar result does not hold for $n > 3$; see Kingman and Williams [7].
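
Goodman's inequality is easy to test numerically. The sketch below is an assumption-laden illustration: the function name, the tolerance and the example matrices are hypothetical choices made here. It checks the condition $\prod_i p_{ii} \ge \det P > 0$ on a product of two Poisson matrices, which is embeddable by construction, and on two matrices that violate it.

```python
import numpy as np


def goodman_condition(P, tol=1e-12):
    """True when prod_i p_ii >= det P > 0 holds (up to a small tolerance).

    This is only the necessary condition quoted above; for 3 x 3 matrices with
    all entries positive it is not claimed to be sufficient.
    """
    P = np.asarray(P, dtype=float)
    det = float(np.linalg.det(P))
    diag_product = float(np.prod(np.diag(P)))
    return det > tol and diag_product >= det - tol


def poisson_matrix(i, j, u, n=3):
    """I - u E_ii + u E_ij, zero-based indices."""
    A = np.eye(n)
    A[i, i] -= u
    A[i, j] += u
    return A


if __name__ == "__main__":
    embeddable = poisson_matrix(0, 1, 0.3) @ poisson_matrix(1, 0, 0.5)
    cyclic = np.array([[0.5, 0.5, 0.0],
                       [0.0, 0.5, 0.5],
                       [0.5, 0.0, 0.5]])
    print(goodman_condition(embeddable), goodman_condition(cyclic))
```

For the cyclic example the determinant is 0.25 while the diagonal product is 0.125, so the condition fails and the matrix cannot be embeddable.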
For P>0 let 


T, 
B(n,m)=PanPmm—> 1, m=1, 2, 3, 
PD 


B(P)=max B(n, m) 


(n, m) 


¢,=min (ey, Pri Eat, (i,j, KES. 
Pi Pri Pji 


The following two lemmas show the significance of the function B(P) for the 
embedding problem. 


l 
Lemma 1.1. Let P=[]| A,,>0 and assume that B(P)<detP. Then also 
PA;'>0. mal 





In what follows p{7 denotes the (i, j)’th element of a matrix P,,; 


i etieie A 
and B,,(i,j)=p' Pi (my 
ji 


(m) _ »,(m) ,(m) __ ,,(m) ,,(m) 
Tj" = Pit Pek Pik Pri 


We call a matrix P regular if all its principal minors are positive, ie., if T,;>0 for 
all 1 <i<3. 


Lemma 1.2. Suppose P>0, det P>0, and B(P)<det P. Let P,=PZ,,(c) where 
(i,j,k) €S and 0<c<C;; so that P,>0. Then 

a) B,(n,m)<det P, for all (n,m) +(, i); 

b) If P is regular, B(P,)<det P,. 

These lemmas and Theorem 1.1 were used to prove 
Theorem 1.2. For a positive matrix P 


a) B(P)=>det P>0 is a sufficient condition for embeddability of P by at most 6 
Poisson matrices and the structure of the embedding product is one of the 
following: 


6 
b) If P is regular and P= || A,, then B(P)2det P>0. 
m=1 
With this background we can summarize the main results of this paper. The 
main result of section 2 is the characterization of 3x3 stochastic matrices 
embeddable by at most 6 Poisson matrices. For a positive 3 x3 stochastic 


matrix P a necessary and sufficient condition to be embeddable by at most 6 
Poisson matrices is 


Ti 
BU) mPuPy. zd P>0 for some i, j=1, 2,3 
ji 


or that i(i,j,k)eS and 0<c<@;; such that 
B,(j,ij=det P, where P,=PZ;,(c). 
Equation (1.5) is equivalent to the following quadratic equation in c: 
PjiPii ue + p;(det P — p;;T,,;— Pi TC + PP jj Tji— Pij det P =O. 


Thus deciding whether a 3x3 stochastic matrix can be embedded by 6 
Poisson matrices amounts to at most checking 9 simple inequalities and solving 
6 quadratic equations. 

In general Poisson matrices of different types do not commute, i.e., $A_{ij}$ and
$A_{lp}$ do not commute unless $(l, p) = (k, j)$. The following concept of extended
commutativity is crucial for the development of Sect. 3 and seems to be relevant
for the embedding problem in general.
Definition 1.1. We say that Poisson matrices $A_{ij}$ and $A_{lp}$ commute in the extended
sense if for any $0 < u_1, u_2 < 1$ there are constants $0 < w_1, w_2 < 1$ such that


$A_{ij}(u_1) A_{lp}(u_2) = A_{lp}(w_1) A_{ij}(w_2)$.


$A_{ij}$ and $A_{lp}$ commute in the extended sense if $(l, p) = (k, j)$, $(i, k)$, or $(j, i)$ (see
Lemma 3.2). $A_{ij}$ and $A_{ki}$ do not commute even in the extended sense. However,
for any constants $0 < u_1, u_2, u_3 < 1$ we can find constants $0 < w_1, w_2, w_3 < 1$ such
that


$A_{ij}(u_1) A_{ki}(u_2) A_{ij}(u_3) = A_{ki}(w_1) A_{ij}(w_2) A_{ki}(w_3)$.

Similarly for $A_{ij}$ and $A_{jk}$ (see Lemma 3.5).
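
Extended commutativity can be illustrated for the pair $A_{ij}$, $A_{ji}$. A direct 2 × 2 computation (carried out for this sketch, not quoted from Lemma 3.2) suggests the explicit choice $w_2 = u_1(1 - u_2)$ and $w_1 = u_2 / (1 - u_1 + u_1 u_2)$; the code below checks numerically that these values satisfy $A_{ij}(u_1) A_{ji}(u_2) = A_{ji}(w_1) A_{ij}(w_2)$. Function names and zero-based indices are assumptions.

```python
import numpy as np


def poisson_matrix(i, j, u, n=3):
    """A_ij(u) = I - u E_ii + u E_ij, zero-based indices."""
    A = np.eye(n)
    A[i, i] -= u
    A[i, j] += u
    return A


def swapped_parameters(u1, u2):
    """Parameters (w1, w2) with A_ij(u1) A_ji(u2) = A_ji(w1) A_ij(w2).

    These formulas come from equating entries of the two products; they are a
    derivation made for this sketch, not a statement taken from the paper.
    """
    w2 = u1 * (1.0 - u2)
    w1 = u2 / (1.0 - u1 + u1 * u2)
    return w1, w2


def check_extended_commutativity(i, j, u1, u2):
    """Return both sides of the extended commutation relation."""
    w1, w2 = swapped_parameters(u1, u2)
    lhs = poisson_matrix(i, j, u1) @ poisson_matrix(j, i, u2)
    rhs = poisson_matrix(j, i, w1) @ poisson_matrix(i, j, w2)
    return lhs, rhs


if __name__ == "__main__":
    lhs, rhs = check_extended_commutativity(0, 1, 0.3, 0.6)
    print(np.allclose(lhs, rhs))
```

Both $w_1$ and $w_2$ stay strictly between 0 and 1 whenever $u_1$ and $u_2$ do, which is what the definition requires.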

In Sect. 3 we study the structure of the Bang-Bang representation for a 3 x 3 
embeddable matrix P>0O with B(P)<det P. Observe that the structure of the 
Bang-Bang representation for a matrix P>0 with B(P)>det P>0 is given in 
Theorem 1.2a). 

The main theorem of Sect. 3 is that the minimal Bang-Bang representation for 
an embeddable 3 x 3 stochastic matrix P>0 with B(P)<det P, ie., the one that 


contains the smallest number of Poisson matrices among all possible Bang-Bang 
representations for P, has the following structure 


— P=(AgiA jp AjjpAgiA jp(AjjAniA jx Aij---) for some (i,j, k)eS (1.6) 


where the product of Poisson matrices in the first parenthesis is a positive 
stochastic matrix, say P’, with the property B’(j, i)=det P’. 

Notice that the representation (1.6) consists of only 3 types of Poisson 
matrices that repeat in cycles of size 3. This is in contrast to the Bang-Bang 
representation for a matrix P>0 with B(P)>det P>0 which consists in general 
of 5 or all 6 types of Poisson matrices, see Theorem 1.2a). 

Johansen [6] showed that the number of Poisson matrices in the Bang-Bang 
representation for 3 x 3 embeddable matrix P is bounded by 6 times the smallest 
integer larger than or equal to (In 4)~*' In det P. 

This bound together with the knowledge of the structure of the minimal 
Bang-Bang representation for a 3 x3 stochastic matrix P>0 with B(P)<det P 
allows in principle to determine whether a 3 x 3 stochastic matrix is embeddable 
or not. The algorithm is discussed in Sect. 3. 


2. Sufficient and Necessary Condition for a Positive Matrix 
to be Embeddable by 6 Poisson Matrices 


We first prove several lemmas. The first lemma is a special case of Theorem 1.1. 


3 
Lemma 2.1. Let P be a matrix such that p,,=0 for some (i,j,k)eS. If |] pi 
i=1 


=det P>0 then P is embeddable by 4 Poisson mattices in the following way: P 
=A,;A Ajj Axi- 
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Proof. Since (p;,=0 and p;;P Pj, = det P) => (T;; <0) the proof of this lemma is the 
same as the proof of Theorem 1.1 for the case T;;<0, see [1]. Observe that 
3 


equality |] p,,=det P implies that one needs 4 rather than 5 Poisson matrices to 
i=1 


embed P. 


Lemma 2.2 Assume P>0 and for some (i,j,k)e€S B(i,i)Sdet P and B(j,i)=det P. 
Then P is embeddable by 5 Poisson matrices as follows: P = A,;A j,A;;A,;A jx- 


Proof. Assume B(j,i)=det P and let P, = PZ 4, (2), Then we have 


Pij 


tT; 1 
2). oa (2). (2). 
Pix =—> Px =9, Pye =—(—Ty) 
kk p ik D k 


ij ij 


T;i 
Pi PS} ke = PsP jj (142 a) =(1+ +2 Pix) det P =det P, 
ij ij Pij 
where the second equality follows by ges: ea 
Now 0<det P=p,;T;;—PyiTi—Pji Tj, and T;,>0 imply that T,;<0 since if 
T,;29 then p;;T;,>det P contrary to the assumption. Hence pi, “)>0. Thus by 


Lemma 2.1 P, = =A,,A_A,; ;A,; and hence P=A,;A,A;;Ayi:Aj,, aS we wished to 
show. 


Lemma 2.3. Suppose P>0, detP>0 and B(P)<detP. Let P,=PZ;,(c) where 
0<c<C;;. Assume i(i,j,k)eS and 0<c<C,; such that B,(j, i) > det P,.. Then there 
exists 0<c<C;,, such that B,(j,i)=det P, and P can be embedded as follows: P 
=A,;A j,Ajj;AyiA jp Ajj- = 

T; 
Proof. Assume B, (j,i)=p{}) pj) +> det P, for some 0<c<Z, 
Pi 


(1) that is 


ij 


ij? 


T..—cT:; 
F()=pi(P i-CD j) 1 > det P for some 0<c<¢,;. 
Pij— CPi 

Then it follows by continuity of f(c), 0<c<¢,; and the assumption f(0)< det P, 
that there exists 0<c<¢,,;, say c*, such that f(c*)=det P. Thus if we let P, 
=PZ;(c*), then B,(j,i)=detP,. But by Lemma 1.2a) B,(n,m)<detP, for 


— i). Hence application of Lemma 2.2 to P,=PZ;,(c*) completes the 
proof. 


Lemma 2.4. Suppose P=|| A;>0, B(k,j)2detP>0 for some (j,i,k)eS and 


i=1 
B(n,m) < det P for (n,m) +(k,j). Then P, =PA,'=0 if and only if A> =Z,,; a 


ik 


Ti; 
Proof. Assume A>! =Z,,; i) then p‘;)=0, r= (- T,;) and iar By 
assumption aie and p,;T;;<det P, hence det P= nal, — Puj Th j— Pig Tij <pyTy 


ji 
—p;jT;; implies T;;<0, showing tnt p\i>0 and thus P,=0. Notice that the 


condition B(k,j)=det P>0 ensures Il pi? > det P,. 


i=1 
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Now assume Av! +Z,,; (2 


Cg 


) but P,20. Then the condition B(n,m)<det P 


for (n,m) +(k,j) implies that Tl p\})<det P, which is impossible. This completes 
the proof. 


Lemma 2.5. Assume P>0, B(k,j)2 det P>0 for some (i, j,k)€S and B(n,m)<det P 
for (n,m) +(k,j). Then at least 5 Poisson matrices are needed to embed P. 


Proof (by contradiction). Suppose $P=\prod_{m=1}^{4}A_m$, $B(k,j)\ge\det P$, and let $P_1=PA_4^{-1}$.
Since 4 is the smallest number of Poisson matrices that can possibly embed a 


positive matrix we must have P, 20. Hence by Lemma 2.4 P, = PZ,; (2 £ Pai) and P, 


ik 
has only one element equal to zero, namely Pi. Now observe that the only way 


P, can be embedded by 3 Poisson matrices is for P, =P, Az‘ to have 3 elements 
equal to zero and these elements in addition to p=0 have to be pif) and p%?.
We will now show that it is impossible to get p?)= pi; =0 thus deriving the 


(1) 


(1) 
contradiction. In order to get p{f)=p\j)=0 we must have Ay'=Z,, ir ) or 
ij 


1) (1) 
pit PH), However, notice that if we let P,=P,Z Gr 2). then 
j Ji 


p' 1 
Pi) = phy) — Pi p(n <a Ti <0 
ji ji 
since T, ;>0 and B(k, k)<det P imply that T,,<0 and hence T;!) = (1 +P) T,; <9. 
Pix 
(1) 
Now if P,=P,Zj, Ga) then p{i)=0, but 
ij 
(1) 
Pi 1 
Pit =P. ~~ Pip = ay (— Ty) 0. 
Pij ij 
This completes the proof. 
We can now prove 


Theorem 2.1. A necessary and sufficient condition for a positive matrix P to be 
embeddable by at most 6 Poisson matrices is 


BU.)=puPy Az det P>0 for some i,j=1,2,3 (2.1) 
or that i(i,j,k)eS and 0<c<C;; such that 
B,(j,i)=detP,>0 where P,=PZ;,(c). 
Equation (2.2) is equivalent to the following quadratic equation in c 
D Pui 1jj67 + Dis (det P — pj; T;j—P ji Ti) ¢ + PisP jj Ty — Pij det P =O. 
Proof. 
Sufficiency: If $B(P)\ge\det P$ then $P=\prod_{m=1}^{6}A_m$ by Theorem 1.2a). Next suppose
B(P)<det P and let P,=PZ;,(c)>0 be the matrix satisfying B,(j,i)=det P,. By 


Lemma 1.2a) B,(n,m)<det P, for (n,m)+(j,i) and hence by Lemma 2.2 P, is 
embeddable by 5 and P by 6 Poisson matrices. 


Necessity (by contradiction). Let $P=\prod_{m=1}^{6}A_m$, $P_1=PA_6^{-1}$ and suppose that P does
not satisfy (2.1) or (2.2). Then by Lemma 1.1 $P_1>0$ and by Lemma 2.3
$B(P_1)<\det P_1$. Next, applying Lemma 1.1 to $P_1$ we get $0<P_2=\prod_{m=1}^{4}A_m$. Now
since 4 is the smallest number of Poisson matrices that can possibly embed a
positive matrix we must have $P_3=\prod_{m=1}^{3}A_m\ge0$. Hence by Lemmas 1.1 and 1.2a)
$B_2(k,j)\ge\det P_2$ for some $(i,j,k)\in S$ and $B_2(n,m)<\det P_2$ for all $(n,m)\ne(k,j)$. But
then by Lemma 2.5 at least 5 Poisson matrices are needed to embed $P_2$,
contradicting $P_2=\prod_{m=1}^{4}A_m$.


3. The Structure of 3 x 3 Embeddable Matrices 


We will denote by $P_{(j,i)}$, $(i,j,k)\in S$, a positive matrix P which satisfies $B(j,i)=\det P>0$ and $B(n,m)<\det P$ for $(n,m)\ne(j,i)$.


Lemma 3.1. Suppose that $P>0$ is an embeddable matrix but $B(P)<\det P$. Then P
can be represented as

$$P=P_{(j,i)}A_1A_2\cdots A_n\quad\text{for some }(i,j,k)\in S\text{ and some Poisson matrices }A_1,A_2,\dots,A_n,\ n\ge1,\qquad(3.1)$$

such that if we let $R_s=P_{(j,i)}A_1A_2\cdots A_{n-s}$, $1\le s\le n-1$, then $R_s>0$ and
$B(R_s)<\det R_s$ for $1\le s\le n-1$.

Proof. Immediate from Lemma 1.1 and Lemma 2.3.


The representation described in Lemma 3.1 is highly nonunique. First, there
may be more than one permutation $(i,j,k)\in S$ for which P has representation
(3.1). Second, for any $(i,j,k)\in S$ for which P has representation (3.1), there are many
choices of the matrix $P_{(j,i)}$, the Poisson matrices $A_1,A_2,\dots,A_n$ and their number
n such that $P=P_{(j,i)}A_1A_2\cdots A_n$.

Let P be as in Lemma 3.1. Let $S_P\subset S$ denote the set of permutations for
which P has the representation described in Lemma 3.1. For $(i,j,k)\in S_P$ let $n_{ij}$
be the smallest n, i.e., the smallest number of Poisson matrices $A_1,A_2,\dots,A_n$, such that
(3.1) holds. Consider the set R of representations for P,

$$R=\{P_{(j,i)}A_1A_2\cdots A_{n_{ij}},\ (i,j,k)\in S_P\}.$$

Let $\bar n=\min\{n_{ij}\mid(i,j,k)\in S_P\}$. We will call any representation in the set R for
which $n_{ij}=\bar n$ a minimal representation for P and write it as $P_{(j,i)}A_1A_2\cdots A_{\bar n}$. We
will call $A_1A_2\cdots A_{\bar n}$ a minimal product for P.

The structure of $P_{(j,i)}$ is given in Lemma 2.2. In order to investigate the
structure of the minimal product for P, we introduce the concept of extended 
commutativity, see Definition 1.1. 

The following definition is identical in nature to Definition 1.1. 


Definition 3.1. We say that $Z_{ij}=A_{ij}^{-1}$ and $Z_{lp}=A_{lp}^{-1}$ commute in the extended
sense if for any constants $c_1,c_2>0$ we can find constants $b_1,b_2>0$ such that

$$Z_{ij}(c_1)Z_{lp}(c_2)=Z_{lp}(b_1)Z_{ij}(b_2).$$

In all that follows the word "commute" is used in the extended sense.
Clearly, $A_{ij}$ and $A_{lp}$ commute if and only if $Z_{ij}$ and $Z_{lp}$ commute.


Lemma 3.2. $A_{ij}$ and $A_{lp}$ commute if and only if $(l,p)=(k,j),(i,k),(j,i),(i,j)$.

Proof. Clearly, $A_{ij}(u_1)A_{ij}(u_2)=A_{ij}(u_2)A_{ij}(u_1)=A_{ij}(u)$ where $u=u_1+u_2-u_1u_2$.
$A_{ij}$ and $A_{kj}$ commute in the usual sense, i.e., for any $0<u_1,u_2<1$,

$$A_{ij}(u_1)A_{kj}(u_2)=A_{kj}(u_2)A_{ij}(u_1).$$

Now it is easy to check that

$$A_{ij}(u_1)A_{ik}(u_2)=A_{ik}(w_1)A_{ij}(w_2)\quad\text{if }w_1=(1-u_1)u_2\text{ and }w_2=\frac{u_1}{1-u_2+u_1u_2},$$

while

$$A_{ij}(u_1)A_{ji}(u_2)=A_{ji}(w_1)A_{ij}(w_2)\quad\text{if }w_1=\frac{u_2}{1-u_1+u_1u_2}\text{ and }w_2=u_1(1-u_2).$$

It is clear that $A_{ij}$ and $A_{lp}$ do not commute if $(l,p)=(j,k)$ or $(l,p)=(k,i)$. This
completes the proof.
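The parameter formulas in the proof of Lemma 3.2 can be checked with exact rational arithmetic. The sketch below rests on an assumption that is not restated in this section: that the Poisson matrix $A_{ij}(u)$ is the 3 x 3 identity with entry (i,i) replaced by 1-u and entry (i,j) replaced by u. The helper names `poisson` and `matmul` are illustrative only.

```python
from fractions import Fraction as F

def poisson(i, j, u, n=3):
    """Assumed form of A_ij(u): identity with (i,i) -> 1-u and (i,j) -> u."""
    a = [[F(int(r == c)) for c in range(n)] for r in range(n)]
    a[i][i] = 1 - u
    a[i][j] = u
    return a

def matmul(a, b):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

i, j, k = 0, 1, 2
u1, u2 = F(1, 3), F(2, 5)

# Same index pair: the parameters combine as u = u1 + u2 - u1*u2.
same_pair = (matmul(poisson(i, j, u1), poisson(i, j, u2))
             == poisson(i, j, u1 + u2 - u1 * u2))

# (l,p) = (k,j): commutativity in the usual sense.
usual = (matmul(poisson(i, j, u1), poisson(k, j, u2))
         == matmul(poisson(k, j, u2), poisson(i, j, u1)))

# (l,p) = (i,k): extended commutativity with adjusted parameters.
w1, w2 = (1 - u1) * u2, u1 / (1 - u2 + u1 * u2)
ext_ik = (matmul(poisson(i, j, u1), poisson(i, k, u2))
          == matmul(poisson(i, k, w1), poisson(i, j, w2)))

# (l,p) = (j,i): extended commutativity with adjusted parameters.
v1, v2 = u2 / (1 - u1 + u1 * u2), u1 * (1 - u2)
ext_ji = (matmul(poisson(i, j, u1), poisson(j, i, u2))
          == matmul(poisson(j, i, v1), poisson(i, j, v2)))

print(same_pair, usual, ext_ik, ext_ji)
```

Using `Fraction` keeps the comparison exact, so equality of the two products is a genuine algebraic check for the chosen parameters rather than a floating-point approximation.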


Lemma 3.3. For any $0<u_1,u_2,u_3<1$ we can find $0<w_1,w_2,w_3<1$ such that

$$A_{ij}(u_1)A_{ki}(u_2)A_{ij}(u_3)=A_{ki}(w_1)A_{ij}(w_2)A_{ki}(w_3).\qquad(3.2)$$

Proof. It is a matter of simple computation to check that if we let

$$w_1=\frac{u_2u_3}{u_1+u_3-u_1u_3},\qquad w_2=u_1+u_3-u_1u_3,\qquad w_3=\frac{u_1u_2(1-u_3)}{u_1+u_3-u_1u_3-u_2u_3},$$

then (3.2) holds. Clearly $0<w_1,w_2,w_3<1$.
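The three-factor identity of Lemma 3.3 can be checked in the same exact way. This is only a sketch under two assumptions: that $A_{ij}(u)$ is the identity with entry (i,i) replaced by 1-u and entry (i,j) by u, and that (3.2) reads $A_{ij}(u_1)A_{ki}(u_2)A_{ij}(u_3)=A_{ki}(w_1)A_{ij}(w_2)A_{ki}(w_3)$ with $w_2=u_1+u_3-u_1u_3$, $w_1=u_2u_3/w_2$ and $w_3=u_1u_2(1-u_3)/(w_2-u_2u_3)$. The function names are hypothetical.

```python
from fractions import Fraction as F

def poisson(i, j, u, n=3):
    """Assumed form of A_ij(u): identity with (i,i) -> 1-u and (i,j) -> u."""
    a = [[F(int(r == c)) for c in range(n)] for r in range(n)]
    a[i][i] = 1 - u
    a[i][j] = u
    return a

def matmul(a, b):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def lemma33_weights(u1, u2, u3):
    """Parameters (w1, w2, w3) as read from the statement of Lemma 3.3 (assumed reading)."""
    w2 = u1 + u3 - u1 * u3
    w1 = u2 * u3 / w2
    w3 = u1 * u2 * (1 - u3) / (w2 - u2 * u3)
    return w1, w2, w3

i, j, k = 0, 1, 2
u1, u2, u3 = F(1, 4), F(1, 2), F(2, 3)
w1, w2, w3 = lemma33_weights(u1, u2, u3)

lhs = matmul(matmul(poisson(i, j, u1), poisson(k, i, u2)), poisson(i, j, u3))
rhs = matmul(matmul(poisson(k, i, w1), poisson(i, j, w2)), poisson(k, i, w3))
print(lhs == rhs, w1, w2, w3)
```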


Lemma 3.4. Suppose $P>0$, $B(P)<\det P$ and let $P_{(j,i)}A_1A_2A_3\cdots A_{\bar n}$, $\bar n\ge1$, be a
minimal representation for P. Then $A_1=A_{ij}$ and $A_2=A_{ki}$.

Proof. If $\bar n=1$ then Lemma 1.2a) together with the definition of $P_{(j,i)}$ show that
$A_1=A_{ij}$. Next suppose $\bar n=2$, that is $P=P_{(j,i)}A_1A_2$. Clearly $P_{(j,i)}A_1$ is then a
minimal representation for $PA_2^{-1}$ with $\bar n=1$. Hence $A_1=A_{ij}$. We will show that
$A_2=A_{ki}$ by elimination of all other possibilities. If $A_2=A_{kj}$, $A_{ik}$ or $A_{ji}$ we have
by Lemma 3.2

$$P=P_{(j,i)}A_{ij}(u_1)A_2(u_2)=P_{(j,i)}A_2(w_1)A_{ij}(w_2)\quad\text{for some }0<w_1,w_2<1.$$

But then $P_{(j,i)}A_2(w_1)A_{ij}(w_2)$ is also a minimal representation for P which is
impossible since $A_2\ne A_{ij}$. Hence $A_2\ne A_{kj},A_{ik},A_{ji}$ and it remains to be shown
that $A_2\ne A_{jk}$. Suppose to the contrary that


P=P;,Ajj(U2)Ajp(u,) for some 0<u,, u,<1 
let 


Py=PZq(cy)Z,,;(c2) _ where o=—, i= 1,2. 


We will show directly that B,(j,i)<det P,, thus contradicting P, =P, j). 


We have 


(2) 
B,(j,i)=pi} pi? i 
ag (3.3) 
(rin) 
= - Pii [u- T-e1Ty)| (1+c,)(1+c,). 
l+c, 


(ru 75, ) 
Pij +c, Pii 


If B,(j,i)2det P, then clearly Tj?)>0 and we must have Tj?) =(1+c,)(1 
+c,)T,<0, and hence T;, <0, since otherwise det P,>0 would imply that 





p'; T;? > det P,, which by Lemma 1.2a) is impossible. Hence letting c= 2 in 
(3.3), using the fact that T,,<0 and defining P, = PZ;,(c) we get es 


B,(j,i) _@y-¢P ii) 
(+e\+e) yer) Pal TiC Ti+ Ty] 
(Pj;— CPi) - _ B,G,i) 
(P;;— CPi) PulTy ai ed Itc! 


=.) 





< 


Now observe that B,(j,i)<det P, or equivalently ———<det P since otherwise 


the minimal product for P would consist of 1 rather than 2 Poisson matrices.
Hence $B_2(j,i)<\det P_2$, as we wanted to show. This concludes the proof for $\bar n=2$.

If $P=P_{(j,i)}A_1A_2\cdots A_{\bar n}$, $\bar n>2$, is a minimal representation for P then $P_{(j,i)}A_1A_2$
is a minimal representation for $PA_{\bar n}^{-1}A_{\bar n-1}^{-1}\cdots A_3^{-1}$. Hence $A_1=A_{ij}$ and $A_2=A_{ki}$.


Theorem 3.1. The minimal representation for an embeddable matrix P >0 such that 
B(P)<det P has the structure 


P=P; (Ai; jAyiA jp Ai; ers )= (Ay; A Ai; Ani A w(Aij jAyiA wAij--) 


for some $(i,j,k)\in S$, (3.4)
where the product of Poisson matrices in the first parenthesis represents $P_{(j,i)}$ and
the product in the second parenthesis is finite.


Proof (by induction on $\bar n$, the size of the minimal product).

The representation of $P_{(j,i)}$ is given in Lemma 2.2. The theorem was proved
for $\bar n=1$ and $\bar n=2$ in Lemma 3.4. Assume that the theorem is true for $\bar n=N$ and
suppose that a minimal product in a minimal representation for P is of size $N+1$, i.e.

$$P=P_{(j,i)}A_1A_2\cdots A_NA_{N+1}\quad\text{for some }(i,j,k)\in S.$$

Clearly $P_{(j,i)}A_1A_2\cdots A_N$ is then a minimal representation for $PA_{N+1}^{-1}$, hence by
the induction assumption we have

$$A_1A_2\cdots A_NA_{N+1}=\underbrace{A_{ij}A_{ki}A_{jk}A_{ij}\cdots A_{ki}}_{N\text{ matrices}}A_{lp}$$

for some $(l,p,r)\in S$. We will show that $(l,p)=(j,k)$ by elimination of all other
possibilities. Clearly $(l,p)\ne(k,i)$ since by assumption a minimal product for P
consists of $N+1$ Poisson matrices. Suppose $(l,p)=(j,i)$, $(k,j)$ or $(i,k)$. Then
applying Lemma 3.2 repeatedly we get


P=F,;, iy Aij(U1) Ayi(U2) A j,(U3) --- Ayi(Uy) Aip(Un + 1) 
=F,;, iy Ayp(W1) A;;(W2) A,;(w3) A jx (w,4)... A; (Wy) Ayi(Wy 4 ) 
for some 0<wy,,W3,...,Wy41<1 
which implies that there is a minimal representation for P with the first matrix
in the minimal product different from $A_{ij}$, which according to Lemma 3.4 is
impossible. Hence $(l,p)\ne(j,i),(k,j),(i,k)$. Finally, suppose $(l,p)=(i,j)$, i.e.
P=F,, Aj j(U1) Agi(Ur) A j,(u3) te A; ;(Uy _ 1) Agi(Uy) A; (Uy, 1) 
for some 0<u,,U3,U3,...,Uy4, <1. 


Then repeated application of Lemma 3.3 gives 


P=P;, jy Agi(W1) Ajj(W2) Agi(W3) --- Ajj(Wy) Agi(Wy 4 1) (3.5) 
for some 0<w,,W,W3,...,Wy,,<1 if N is even 


P=F,, iy Ajj(U1) A j(Z1) Agi(Z2) tee A; (Zy_ 1) Ayi(Zy) (3.6) 
for some 0<2z,,Z3,...,Zy<1 if N is odd. 


Thus when N is even (3.5) is a minimal representation for P with the first matrix
in the minimal product different from $A_{ij}$, while when N is odd (3.6) is a
minimal representation for P with a second matrix in a minimal product
different from $A_{ki}$, which is impossible by Lemma 3.4. This completes the proof.

Theorem 3.1 together with the bound on the number of Poisson matrices in
the Bang-Bang representation (see the introduction) suggests the following algorithm
for determining whether a given 3 x 3 stochastic matrix is embeddable or not.

We start by asking whether a given matrix P>0, which is not embeddable
by 6 Poisson matrices can be embedded by 7 Poisson matrices. Let P, 
= PZ, ,(c,)Z (C2) =P, Z i, (C2). By (3.4) the question becomes: are there (i,j, k)eS 
and constants 0<c,<¢,,, 0<c,<¢t)) such that B,(k,j)=det P,. If the answer 
is negative we ask about embeddability of P by 8 Poisson matrices. Let 


Ps =PZ;;(C1) Z (C2) Zyi(C3) = Py Z (C2) Zyi(C3) =P, Zi (C3) 


and ask are there (i,j,k)e€S and constants 0<c,<@,,, 0<c,<t\)), 0<c;<¢{?) 
such that B,(i,k)=det P,. We continue this way until we find the right number 
of Poisson matrices that embed a given matrix or reach one plus the upper 
bound, whichever is smaller. In the last case we conclude that the matrix is not 
embeddable. 
A Note on Limit Theorems in Percolation


Gunnar Brånvall*
Uppsala University, Dept. of Mathematics, Thunbergsvägen 3, 75232 Uppsala, Sweden


Summary. Laws of large numbers and central limit theorems are proved for 
some cluster functions, e.g. the number of points in a large box which are 
(+) connected to its boundary or the number of (+) clusters in the box. 


1. Introduction 


We shall consider Bernoulli atom percolation in $Z^2$ and shall mainly adopt the
notation of Russo [7], which is briefly as follows:

Nearest neighbours in $Z^2$ are called adjacent and points which are nearest
neighbours or diagonal nearest neighbours are called * adjacent. A set $A\subset Z^2$ is
connected [* connected] if for all $x,y\in A$ there is a chain of adjacent [*
adjacent] points in A which has x and y as terminal points.

The configuration space is $\Omega=\{-1,1\}^{Z^2}$ and $\pm1$ are sometimes called spins.
A maximal connected [* connected] component of $\omega^{-1}(1)$ is called a (+) cluster
[(+)* cluster] of $\omega\in\Omega$.

The measure is $P(p)=\prod_{x\in Z^2}\nu_p(x)$, where $0<p<1$ and $\nu_p$ assigns weights p and $1-p$ to 1 and $-1$.

For $x\in Z^2$, let $C(x)$ [$C^*(x)$] be those points which are (+) connected [(+)*
connected] to x. Let $N(x)=|C(x)|$. $N(0)$ is denoted simply as N and the variable
$N\,I(N<\infty)$ is called $N'$.


Then some basic functions are:

The percolation function $P_\infty(p)=P(N=\infty)$.

The mean size of finite clusters (susceptibility) $S(p)=EN'$.

The number of clusters per site $K(p)=E\,N^{-1}I(0<N)$.


The purpose of this note is to check some facts concerning the physical
interpretation of these quantities. In Sect. 2 some ergodic properties are mentioned
and Sect. 3 contains central limit theorems.
We shall need some nice results concerning the moments of $N'$, which were
obtained independently by Russo [7] and by Seymour and Welsh [8].

Let $p_c=\inf\{p: P_\infty>0\}$, $\pi_c=\sup\{p: P_\infty=0\text{ and }S(p)<\infty\}$ and define $p_c^*$ and $\pi_c^*$
similarly.

Theorem 1.1 (Russo, Seymour, Welsh). a) $1-p_c^*=\pi_c\le p_c=1-\pi_c^*$.

b) For p off the interval $[\pi_c,p_c]$, $E(N')^r<\infty$ for any r.

Especially b) will be repeatedly used in the sequel.


2. Ergodic Theorems 


The following lemma is a well-known consequence of Birkhoff's ergodic theorem.
Cf. e.g. Pitt [6], Theorem 5, p. 337.


Lemma 2.1. Let $(\Omega,\mathcal{A},P)$ be a probability space. Let T and U be ergodic
transformations and suppose that $f\in L^r$, $r\ge1$. Then

$$n^{-2}\sum_{i=0}^{n-1}\sum_{k=0}^{n-1}f(T^iU^k\omega)\to Ef\quad\text{a.s. and in }L^r\text{ as }n\to\infty.$$

Let T [U] be the translation of the spin configuration one step to the left
[downward]. Then T and U are ergodic and the lemma may be applied to
appropriate cluster functions to give alternative interpretations of the percolation
functions.


Notation. Let $K_n$ be the square $\{z\in Z^2: 0\le z_1,z_2\le n-1\}$ and let the (inner)
boundary $\partial K_n=\{z\in K_n: z_1\text{ or }z_2=0\text{ or }n-1\}$.


Theorem 2.2. Let $N_n$ be the number of (+) clusters in $K_n$ which contain no
boundary point. Then

$$n^{-2}N_n\to K(p)\quad\text{a.s. and in any }L^r\text{ as }n\to\infty.$$


Remark. The convergence was shown by Grimmett [4], using a subadditive
argument. The limit was identified as $K(p)$ by Smythe and Wierman, Theorem 3.7
in [9], where they show that $K(p)$ is differentiable a.e. We observe that
the derivative exists and is continuous except possibly at $p_c$. This follows from
essentially the same arguments as Proposition 4 in [7]:

Differentiating $K(p)=\sum_{\gamma\ni0}|\gamma|^{-1}p^{|\gamma|}(1-p)^{|\partial\gamma|}$ term by term one formally gets

$$\sum_{\gamma\ni0}p^{|\gamma|-1}(1-p)^{|\partial\gamma|}-\sum_{\gamma\ni0}\frac{|\partial\gamma|}{|\gamma|}\,p^{|\gamma|}(1-p)^{|\partial\gamma|-1}.$$

Here the summation index $\gamma$ runs over all connected subsets of $Z^2$ containing
the origin. Since $|\partial\gamma|/|\gamma|\le4$ and

$$\sum_{|\gamma|\ge n}p^{|\gamma|}(1-p)^{|\partial\gamma|}=P(n\le N<\infty)$$

is an increasing function of p on $[0,p_0]$, where $P_\infty(p_0)=0$, the series above
converges uniformly on the interval $[0,p_0]$. On an interval $[p_1,p_2]$, where
$p_c<p_1<p_2<1$, the uniform convergence follows from (4.4) of [7].
(A slight elaboration of the argument shows that $K'(p_c)$ exists if $P_\infty(p_c)=0$.)

A similar argument using Theorem 1.1b shows that higher derivatives exist
for $p<\pi_c$ or $p>p_c$.


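Proof of Theorem 2.2. Let $X(x)=(N(x))^{-1}I(N(x)>0)$. We then have the identity

$$\sum_{x\in K_n}X(x)=N_n+\sum_{x\in\partial K_n}Y_n(x),\qquad(2.1)$$

where

$$Y_n(x)=\begin{cases}\dfrac{k_1}{k\,k_2}&\text{if }x\text{ belongs to a (+) cluster of size }k\text{ with }k_1\text{ points in }K_n\text{ and }k_2\text{ points in }\partial K_n,\\[4pt]0&\text{otherwise.}\end{cases}$$

Here $\sum_{x\in\partial K_n}Y_n(x)\le|\partial K_n|=o(n^2)$.

By Lemma 2.1, $n^{-2}\sum_{x\in K_n}X(x)\to EX(0)=K(p)$ and the theorem follows.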
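The quantity $n^{-2}N_n$ of Theorem 2.2 is easy to compute for a simulated configuration. The following is a minimal sketch, assuming independent spins that are +1 with probability p and nearest-neighbour (not *) adjacency; the function names, the box size and the seed are illustrative choices, not taken from the paper.

```python
import random
from collections import deque

def plus_clusters(spins):
    """Return the (+) clusters (nearest-neighbour connected sets of +1 sites) of a square array."""
    n = len(spins)
    seen = [[False] * n for _ in range(n)]
    clusters = []
    for r in range(n):
        for c in range(n):
            if spins[r][c] != 1 or seen[r][c]:
                continue
            seen[r][c] = True
            queue, cluster = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                cluster.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < n and 0 <= xx < n and spins[yy][xx] == 1 and not seen[yy][xx]:
                        seen[yy][xx] = True
                        queue.append((yy, xx))
            clusters.append(cluster)
    return clusters

def interior_cluster_count(spins):
    """N_n: number of (+) clusters in the box that contain no point of the inner boundary."""
    n = len(spins)
    def on_boundary(y, x):
        return y in (0, n - 1) or x in (0, n - 1)
    return sum(1 for cl in plus_clusters(spins)
               if not any(on_boundary(y, x) for y, x in cl))

def sample_spins(n, p, rng):
    """Independent spins on the n x n box: +1 with probability p, otherwise -1."""
    return [[1 if rng.random() < p else -1 for _ in range(n)] for _ in range(n)]

rng = random.Random(0)
n, p = 200, 0.3
estimate = interior_cluster_count(sample_spins(n, p, rng)) / n ** 2
print(estimate)  # one-sample estimate of K(p)
```

A cluster that avoids the inner boundary of the box is also a cluster of the full lattice configuration, which is why counting inside the box suffices here.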


Theorem 2.3. Let the sizes of the (+) clusters in $K_n$ be $d_1^{(n)},\dots,d_{N_n}^{(n)}$ and let
$S_n=n^{-2}\sum_{i=1}^{N_n}(d_i^{(n)})^2$. Then, if $S(p)<\infty$, $S_n\to S(p)$ a.s. and in any $L^r$ as $n\to\infty$; if
$S(p)=\infty$, $S_n\to\infty$ a.s.

Remark. The result $ES_n\to S(p)$ has sometimes been used as a definition of $S(p)$.
Cf. Essam [3], p. 221. A quantity much resembling $S_n$ has also been used in a
Monte Carlo study of $S(p)$. See Dean [2].


Remark. The results of Russo show that $S(p)$ is infinitely differentiable for $p<\pi_c$
or $p>p_c$.


Proof. Suppose $S(p)<\infty$ and consider the identity

$$\sum_{x\in K_n}N'(x)=n^2S_n+\sum_{x\in K_n}Y_n(x),$$

where $Y_n(x)=N'(x)\,I(C(x)\cap\partial K_n\ne\emptyset)$.
Since by Theorem 1.1 and Lemma 2.1 $n^{-2}\sum_{x\in K_n}N'(x)\to S(p)$ a.s. and in $L^r$, it
suffices to check that $n^{-2}\sum_{x\in K_n}Y_n(x)\to0$ a.s. and in $L^r$. Let $\varepsilon>0$ and $n_0$ be given.
Then, if n is large enough,

$$n^{-2}\sum_{x\in K_n}Y_n(x)=n^{-2}\sum_{x\in K_n\setminus K_{(1-\varepsilon)n}}Y_n(x)+n^{-2}\sum_{x\in K_{(1-\varepsilon)n}}Y_n(x)$$

$$\le n^{-2}\sum_{x\in K_n\setminus K_{(1-\varepsilon)n}}N'(x)+n^{-2}\sum_{x\in K_{(1-\varepsilon)n}}N'(x)\,I(N'(x)\ge n_0).$$

It follows from Lemma 2.1 that both of these terms converge a.s. and in $L^r$, the
first one towards $(1-(1-\varepsilon)^2)S(p)$ and the second one towards
$(1-\varepsilon)^2EN'I(N'\ge n_0)$. As $\varepsilon$ and $n_0$ are arbitrary this ends the proof.

The case when $S(p)=\infty$ follows by truncation.
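The statistic $S_n$ of Theorem 2.3 can be computed along the same lines. This sketch reads "the (+) clusters in $K_n$" as the clusters that contain no boundary point, which is the reading suggested by the identity at the start of the proof; that reading, the function names and the parameter values are assumptions made for illustration.

```python
import random

def cluster_sizes_without_boundary(spins):
    """Sizes of the (+) clusters of a square array that contain no boundary site."""
    n = len(spins)
    seen = [[False] * n for _ in range(n)]
    sizes = []
    for r in range(n):
        for c in range(n):
            if spins[r][c] != 1 or seen[r][c]:
                continue
            seen[r][c] = True
            stack, size, touches = [(r, c)], 0, False
            while stack:
                y, x = stack.pop()
                size += 1
                if y in (0, n - 1) or x in (0, n - 1):
                    touches = True
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < n and 0 <= xx < n and spins[yy][xx] == 1 and not seen[yy][xx]:
                        seen[yy][xx] = True
                        stack.append((yy, xx))
            if not touches:
                sizes.append(size)
    return sizes

def s_n(spins):
    """S_n = n^{-2} * (sum of squared cluster sizes), under the reading stated above."""
    n = len(spins)
    return sum(d * d for d in cluster_sizes_without_boundary(spins)) / n ** 2

rng = random.Random(1)
n, p = 200, 0.2
spins = [[1 if rng.random() < p else -1 for _ in range(n)] for _ in range(n)]
print(s_n(spins))  # one-sample estimate of S(p) at a small density p
```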


Theorem 2.4. Let $M_n$ be the number of points in $K_n$ which are (+) connected to
$\partial K_n$. Then

a) $n^{-2}M_n\to P_\infty$ a.s. and in any $L^r$ as $n\to\infty$.

b) For $p<\pi_c$, $n^{-1}M_n\to4\mu$ in any $L^r$ as $n\to\infty$,
where $\mu=EY(0,0)$ and

$$Y(i,0)=\begin{cases}\dfrac{k_1}{k_2}&\text{if }(i,0)\text{ belongs to a (+) cluster with }k_1\text{ points in the upper halfplane and }k_2\text{ points on the }x\text{-axis},\\[4pt]0&\text{otherwise.}\end{cases}$$

Remark on b). We shall prove b) by referring to the one-dimensional ergodic
theorem. This simple argument is insufficient to prove a.s. convergence. The
reason for this is that the transformation $n\to n+1$ only adds one point to the
lower side of $K_n$ but changes all points in the upper side. Still, one may prove
a.s. convergence by showing that the fourth central moment of $M_n$ is $O(n^2)$. This
longer argument is omitted.


Proof of a). Write

$$M_n=\sum_{x\in K_n}I(N(x)=\infty)+\sum_{x\in K_n}I(N(x)<\infty,\ C(x)\cap\partial K_n\ne\emptyset)$$

and repeat the argument in the proof of Theorem 2.3.


Proof of b). Write $M_n=\sum_{x\in\partial K_n}Y_n(x)$, where

$$Y_n(x)=\begin{cases}\dfrac{k_1}{k_2}&\text{if }x\text{ belongs to a finite (+) cluster with }k_1\text{ points in }K_n\text{ and }k_2\text{ points in }\partial K_n,\\[4pt]0&\text{otherwise.}\end{cases}$$

By symmetry it suffices to check that $n^{-1}\sum_{i=0}^{n-1}Y_n(i,0)\to EY(i,0)$ in any $L^r$.

$$n^{-1}\sum_{i=0}^{n-1}Y_n(i,0)=n^{-1}\sum_{i=0}^{n-1}Y(i,0)+n^{-1}\sum_{i=0}^{n_0-1}(Y_n(i,0)-Y(i,0))$$

$$+\,n^{-1}\sum_{i=n_0}^{n-n_0-1}(Y_n(i,0)-Y(i,0))+n^{-1}\sum_{i=n-n_0}^{n-1}(Y_n(i,0)-Y(i,0)).$$

As $Y(i,0)\le N(i,0)$ it follows from Theorem 1.1 that the Y's have moments of all
orders. Thus by the one-dimensional ergodic theorem the first term above tends
to $EY(0,0)$ in any $L^r$. In the third term

$$|Y(i,0)-Y_n(i,0)|\le2N(i,0)\,I(N(i,0)\ge n_0)$$

and by the ergodic theorem

$$\limsup_{n\to\infty}\|\text{third term}\|_r\le2\,\|N(0,0)\,I(N(0,0)\ge n_0)\|_r,$$

which is small for large $n_0$. Clearly, the norms of the second and fourth terms
tend to zero.
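For a simulated configuration, $M_n$ can be obtained by a single breadth-first search started from all (+) sites of the inner boundary. The sketch below assumes nearest-neighbour adjacency and independent spins; the names and parameter values are illustrative. Both normalisations of Theorem 2.4 are printed for one sample.

```python
import random
from collections import deque

def boundary_connected_count(spins):
    """M_n: number of +1 sites of the n x n box joined to its inner boundary by +1 sites."""
    n = len(spins)
    seen = [[False] * n for _ in range(n)]
    queue = deque()
    # Seed the search with every (+) site lying on the inner boundary.
    for r in range(n):
        for c in range(n):
            if (r in (0, n - 1) or c in (0, n - 1)) and spins[r][c] == 1:
                seen[r][c] = True
                queue.append((r, c))
    count = 0
    while queue:
        y, x = queue.popleft()
        count += 1
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            yy, xx = y + dy, x + dx
            if 0 <= yy < n and 0 <= xx < n and spins[yy][xx] == 1 and not seen[yy][xx]:
                seen[yy][xx] = True
                queue.append((yy, xx))
    return count

rng = random.Random(2)
n, p = 200, 0.3
spins = [[1 if rng.random() < p else -1 for _ in range(n)] for _ in range(n)]
m_n = boundary_connected_count(spins)
print(m_n / n ** 2, m_n / n)  # normalisations used in parts a) and b)
```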


3. Central Limit Theorems 


3.1. Some Lemmas. Lemma 3.1 is a special case of Theorem 4.2, p. 25 in [1].
Lemma 3.2 is Lemma 20.3, p. 172 in [1], adapted to the case of a two-dimensional
index set. Its proof is immediate. Lemma 3.3 is a well-known result about
m-dependent variables. Cf. e.g. [5], Theorem 19.2.1, p. 370, where it is stated for the
case of a one-dimensional index set. For the sake of completeness a proof is
given, using Lemmas 3.1 and 3.2.


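Lemma 3.1. Let $\{Y_n\}_1^\infty$ be r.v. such that for any integer u there is a partition
$Y_n=X_{un}+\delta_{un}$ such that

(i) $X_{un}\xrightarrow{d}X_u$ as $n\to\infty$ for u fixed.
(ii) $X_u\xrightarrow{d}X$ as $u\to\infty$.
(iii) $\lim_{u\to\infty}\limsup_{n\to\infty}E\delta_{un}^2=0$.

Then $Y_n\xrightarrow{d}X$ as $n\to\infty$.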


Lemma 3.2. Let $\{X(x)\}_{x\in Z^2}$ be a stationary process in $L^2$. Suppose that $EX(0)=0$
and $\sum_{x\in Z^2}|E(X(0)X(x))|=a^2<\infty$. For a finite subset A of $Z^2$, let $S(A)=\sum_{x\in A}X(x)$.
Then

a) $|A|^{-1}E(S(A))^2\le a^2$,
b) $n^{-2}E(S(K_n))^2\to\sigma^2=\sum_{x\in Z^2}E(X(0)X(x))$, as $n\to\infty$.

Notation. For $x,y\in Z^2$, let $\|x\|=|x_1|+|x_2|$ and $d(x,y)=\|x-y\|$. Let $A_n(x)=\{y: d(x,y)=n\}$.

A process $\{X(x)\}_{x\in Z^2}$ is called m-dependent if for all finite subsets A and B of
$Z^2$ such that $d(A,B)>m$, the families $\{X(x)\}_{x\in A}$ and $\{X(x)\}_{x\in B}$ are independent.


Lemma 3.3. Let $\{X(x)\}_{x\in Z^2}$ be a stationary, m-dependent process and assume that
$EX(0)=0$, $E(X(0))^2<\infty$. Then

$$n^{-1}\sum_{x\in K_n}X(x)\xrightarrow{d}N(0,\sigma^2),\quad\text{as }n\to\infty,\text{ where }\sigma^2=\sum_x E(X(0)X(x)).$$

Remark. $\sigma^2<\infty$ since the sum of covariances contains finitely many terms. In
general, however, it may happen that $\sigma^2=0$. In this case the assertion of the
lemma could be sharpened. For example one may check that in this case
$\lim E(S(K_n))^2/n$ exists. In the applications of the lemma which are to follow,
unfortunately, I have been unable to prove that this pathological case does not
happen.


Proof. Divide $K_n$ into smaller squares (side u) separated by channels of width m.
Write for u fixed $n=k(u+m)+s$, $0\le s<u+m$, and let the union of the $k^2$ smaller
squares be $A_n=B_n\times B_n$, where

$$B_n=\{z: i(u+m)\le z<i(u+m)+u,\ i=0,1,\dots,k-1\}.$$

Consider the partition $n^{-1}S(K_n)=n^{-1}\sum_{x\in A_n}X(x)+\delta_{un}=X_{un}+\delta_{un}$. It is easy to
verify conditions (i)-(iii) in Lemma 3.1. By m-dependence $\sum_{x\in A_n}X(x)$ is a sum of $k^2$
independent sums, each distributed as $S(K_u)$. It follows that

$$k^{-1}\sum_{x\in A_n}X(x)\xrightarrow{d}N(0,E(S(K_u))^2)$$

as $n\to\infty$. Thus $X_{un}\xrightarrow{d}N(0,\sigma_u^2)$, as $n\to\infty$, where $\sigma_u^2=ES(K_u)^2/(u+m)^2$. This
verifies (i).

Secondly, it follows from Lemma 3.2b) that $\lim_{u\to\infty}\sigma_u^2=\sigma^2$, which verifies (ii).

Thirdly, by Lemma 3.2a),

$$E\delta_{un}^2=n^{-2}E\Bigl(\sum_{x\in K_n\setminus A_n}X(x)\Bigr)^2\le\frac{|K_n\setminus A_n|}{n^2}\,a^2.$$

Thus $\limsup_{n\to\infty}E\delta_{un}^2\le\bigl(1-\bigl(\tfrac{u}{u+m}\bigr)^2\bigr)a^2$, which tends to zero, as $u\to\infty$. This
verifies (iii) and by Lemma 3.1 $n^{-1}S(K_n)\xrightarrow{d}N(0,\sigma^2)$ as $n\to\infty$.


3.2. Bounded Clusters. Lemma 3.3 leads immediately to limit theorems for cluster
functions, which depend only on the spins in a bounded part of the plane. As an
example one has


Theorem 3.4. Let $N_n(k)$ be the number of (+) clusters in $K_n$ of size k which contain
no point in $\partial K_n$. Then

$$n^{-1}(N_n(k)-n^2P_k/k)\xrightarrow{d}N(-\mu_k,\sigma_k^2),$$

as $n\to\infty$, where

$$P_k=P(N=k),\qquad\sigma_k^2=k^{-2}\sum_x C\bigl(I(N=k),I(N(x)=k)\bigr)$$

and the edge effect $\mu_k=4EX$, where

$$X=\begin{cases}\dfrac{k_1}{k\,k_2}&\text{if }0\text{ belongs to a (+) cluster of size }k\text{ with }k_1\text{ points in the upper halfplane and }k_2\text{ points on the }x\text{-axis},\\[4pt]0&\text{otherwise.}\end{cases}$$
Remark. The edge effect $\mu_k$ may be eliminated by assuming toroidal boundary
conditions. This remark applies also in the sequel.


Proof. Letting $X(x)=k^{-1}I(x\text{ belongs to a (+) cluster of size }k)$,

$$\sum_{x\in K_n}X(x)=N_n(k)+\sum_{x\in\partial K_n}Y_n(x),$$

where

$$Y_n(x)=\begin{cases}\dfrac{k_1}{k\,k_2}&\text{if }x\text{ belongs to a (+) cluster of size }k\text{ with }k_1\text{ points in }K_n\text{ and }k_2\text{ points in }\partial K_n,\\[4pt]0&\text{otherwise.}\end{cases}$$

By Lemma 3.3 the left hand side converges (after norming) towards $N(0,\sigma_k^2)$. By
symmetry it then suffices to show

$$n^{-1}\sum_{i=0}^{n-1}Y_n(i,0)\xrightarrow{P}EX,\quad\text{as }n\to\infty.$$

This is clearly true since $\{Y_n(i,0)\}_{i=k}^{n-k}$ are 2k-dependent r.v. distributed as X.
Example. For $k=3$, with $q=1-p$,

$$P_3/3=2p^3q^7(2+q),$$
$$\mu_3=12p^3q^7(2+q),$$
$$\sigma_3^2=2p^3q^7(2+q)+4p^6q^{11}(1+27q+57q^2-85q^3-123q^4-35q^5).$$

It is of course a difficult combinatorial problem to compute these parameters
for large values of k.
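The first two formulas of the example can be reproduced by brute-force enumeration of the clusters of size k that contain the origin. The sketch below assumes nearest-neighbour clusters, so that a cluster with t vacant neighbouring sites has probability $p^kq^t$, and it takes the edge variable to be $X=k_1/(k\,k_2)$ with the upper halfplane including the x-axis; all names are hypothetical. It collects the coefficients of $p^kq^t$ in $P_k/k$ and in $\mu_k=4EX$.

```python
from collections import Counter
from fractions import Fraction

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def animals_containing_origin(k):
    """All nearest-neighbour connected sets of k lattice points that contain (0, 0)."""
    found = {frozenset([(0, 0)])}
    for _ in range(k - 1):
        grown = set()
        for animal in found:
            for (x, y) in animal:
                for dx, dy in NEIGHBOURS:
                    site = (x + dx, y + dy)
                    if site not in animal:
                        grown.add(animal | {site})
        found = grown
    return found

def perimeter(animal):
    """Number of sites outside the animal that are adjacent to it."""
    return len({(x + dx, y + dy) for (x, y) in animal for dx, dy in NEIGHBOURS} - animal)

k = 3
prob_coeffs = Counter()  # coefficient of p^k q^t in P_k / k, keyed by t
edge_coeffs = Counter()  # coefficient of p^k q^t in mu_k = 4 E X, keyed by t
for animal in animals_containing_origin(k):
    t = perimeter(animal)
    prob_coeffs[t] += Fraction(1, k)
    k1 = sum(1 for (_, y) in animal if y >= 0)  # points in the upper halfplane
    k2 = sum(1 for (_, y) in animal if y == 0)  # points on the x-axis
    edge_coeffs[t] += 4 * Fraction(k1, k * k2)

print(dict(prob_coeffs), dict(edge_coeffs))
```

For k = 3 the enumeration gives coefficients 4 and 2 at $q^7$ and $q^8$ for $P_3/3$, and 24 and 12 for $\mu_3$, in agreement with $2p^3q^7(2+q)$ and $12p^3q^7(2+q)$. The number of clusters grows quickly with k, which is the combinatorial difficulty the text mentions.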


3.3. Unbounded Cluster Functions. In order to prove central limit theorems for
the quantities treated in Sect. 2 one needs to combine Theorem 1.1 and Lemma 3.3
using some truncation argument.


Theorem 3.5. Let $N_n$ be as in Theorem 2.2 and assume that $p<\pi_c$ or $p>p_c$. Then

$$n^{-1}(N_n-n^2K(p))\xrightarrow{d}N(-\mu,\sigma^2),$$

as $n\to\infty$, where

$$\sigma^2=\sum_x C\bigl(N^{-1}I(N>0),(N(x))^{-1}I(N(x)>0)\bigr)$$

and $\mu=4EX$,

$$X=\begin{cases}\dfrac{k_1}{k\,k_2}&\text{if }0\text{ belongs to a (+) cluster of size }k\text{ with }k_1\text{ points in the upper halfplane and }k_2\text{ points on the }x\text{-axis},\\[4pt]0&\text{otherwise.}\end{cases}$$

Remark. The condition on p looks unnatural in this context.


Remark. Here and in the following theorems it will be clear from the proofs that
$\sigma^2<\infty$.


Theorem 3.6. Let $S_n$ be as in Theorem 2.3 and assume that $p<\pi_c$ or $p>p_c$. Then
n(S, ao S(p)) N( —BH, 0”), 


o? = C(N’(0), N’(x) 


ky 
k, 
in the upper halfplane and k, points on the x-axis, 


if 0 belongs to a (+) cluster of size k with k, points 


0 otherwise. 


Remark. For $p<\pi_c$, one may check rigorously that $\sigma^2>0$. In this case one may
replace $N'(x)$ by $N(x)$ which is an increasing function of the (+) spins. Thus by
the F.K.G. inequalities (cf. [7] Lemma 1, p. 42) each covariance in the sum is
nonnegative and at least one term is positive.


Theorem 3.7. Let $M_n$ be as in Theorem 2.4 and let Y be the process defined there.
Then

a) For $p>p_c$,

$$n^{-1}(M_n-n^2P_\infty)\xrightarrow{d}N(4\mu,\sigma^2),$$

as $n\to\infty$, where $\mu=EY(0,0)$ and $\sigma^2=\sum_x\bigl(P(0,x\text{ belong to the infinite cluster})-P_\infty^2\bigr)$.

b) For $p<\pi_c$,

$$n^{-1/2}(M_n-4n\mu)\xrightarrow{d}N(0,4\gamma^2),$$

where $\gamma^2=\sum_i C(Y(0,0),Y(i,0))$.

Remark. In a) one may check that $\sigma^2>0$ by noting that $I(N(x)=\infty)$ is an
increasing function of the (+) spins.

In the proofs of Theorems 3.5 and 3.6 we need the following:


Lemma 3.8. Suppose $E(N')^7<\infty$. Then for any $\varepsilon>0$ there exists $n_0$ such that

$$\sum_{\|x\|\ge n_0}|C(g(C(0)),g(C(x)))|<\varepsilon$$

for all functions $g(C(x))$ such that $0\le g(C(x))\le N'(x)$.

Proof. Applying the elementary inequality

$$|C(U_1+V_1,U_2+V_2)|\le|C(U_1,U_2)|+\sqrt{EU_1^2\,EV_2^2}+\sqrt{EU_2^2\,EV_1^2}+\sqrt{EV_1^2\,EV_2^2}$$

to

$$U_1=g(C(0))\,I\bigl(C(0)\cap A_{[\|x\|/2]}(0)=\emptyset\bigr),$$
$$U_2=g(C(x))\,I\bigl(C(x)\cap A_{[\|x\|/2]}(x)=\emptyset\bigr),$$
$$V_1=g(C(0))-U_1,$$
$$V_2=g(C(x))-U_2,$$

using that

a) $U_1$ and $U_2$ are independent,
b) $EU_1^2=EU_2^2\le E(N')^2<\infty$ and
c) $EV_1^2=EV_2^2\le E\bigl[(N')^2I(N'\ge[\|x\|/2])\bigr]\le\text{const}\cdot\|x\|^{-5}$,

one gets

$$\sum_{\|x\|\ge n_0}|C(g(C(0)),g(C(x)))|\le\sum_{k=n_0}^{\infty}4k\,(0+\text{const}\cdot k^{-5/2}+\text{const}\cdot k^{-5}),$$

which is less than $\varepsilon$ if $n_0$ is large.


Proof of Theorem 3.5. Consider (2.1). As in the proof of Theorem 2.4b) it is easy
to see that $n^{-1}\sum_{x\in\partial K_n}Y_n(x)\xrightarrow{P}\mu$, as $n\to\infty$.
It remains to show

$$n^{-1}\sum_{x\in K_n}(X(x)-K(p))\xrightarrow{d}N(0,\sigma^2),\quad\text{as }n\to\infty.\qquad(3.1)$$

Write $X(x)-EX(x)=N(x)^{-1}I(C(x)\ne\emptyset)-K(p)$ as $Y_u'(x)+Y_u''(x)$, where

$$Y_u'(x)=(N(x))^{-1}I(C(x)\ne\emptyset,\ C(x)\cap A_u(x)=\emptyset)-E\,N(x)^{-1}I(C(x)\ne\emptyset,\ C(x)\cap A_u(x)=\emptyset).$$

The rest of the proof is to apply Lemma 3.1 to the partition

$$n^{-1}\sum_{x\in K_n}(X(x)-K(p))=X_{un}+\delta_{un},\qquad X_{un}=n^{-1}\sum_{x\in K_n}Y_u'(x).$$

Since $\{Y_u'(x)\}$ are 2u-dependent it follows from Lemma 3.3 that
$X_{un}\xrightarrow{d}N(0,\sigma_u^2)$, as $n\to\infty$, where

$$\sigma_u^2=\sum_x C(Y_u'(0),Y_u'(x)).\qquad(3.2)$$

This verifies (i).

Secondly, $\lim_{u\to\infty}\sigma_u^2=\sigma^2$, since we have termwise convergence in (3.2) and the
sum (3.2) converges uniformly in u by Lemma 3.8. This verifies (ii).

To verify (iii) it suffices by Lemma 3.2a) to show that

$$\lim_{u\to\infty}\sum_x|C(Y_u''(0),Y_u''(x))|=0.$$

Here again termwise convergence is immediate and the sum converges uniformly
in u by Lemma 3.8.

Hence Lemma 3.1 applies and (3.1) is proved.

The proof of Theorem 3.6 is omitted since it is almost the same as that of
Theorem 3.5.


In the proof of Theorem 3.7 one needs to replace Lemma 3.8 by the following

Lemma 3.9. Let $I(x)=I(N(x)=\infty)$, $I_u(x)=I(C(x)\cap A_u(x)\ne\emptyset)$. Then, for $p>p_c$,

$$0\le\left\{\begin{matrix}C(I(0),I(x))\\ C(I(0),I_u(x))\\ C(I_u(0),I_u(x))\end{matrix}\right\}\le2E\bigl(I_{[\|x\|/2]}(0)-I(0)\bigr)=O(\|x\|^{-r})$$

uniformly in u for any r.

Proof. The right relation follows from Theorem 1.1, as

$$E(I_u(0)-I(0))\le P(N'\ge u).$$

The left inequalities follow from the F.K.G. inequality since $I(x)$ and $I_u(x)$ are
increasing functions.

Concerning the middle inequalities, suppose first that $u\ge[\|x\|/2]$. Then

$$E(I_u(0)I_u(x))\le E\bigl(I_{[\|x\|/2]}(0)\,I_{[\|x\|/2]}(x)\bigr)=\bigl(E\,I_{[\|x\|/2]}(0)\bigr)^2$$

by independence and

$$C(I_u(0),I_u(x))\le\bigl(E\,I_{[\|x\|/2]}(0)\bigr)^2-\bigl(E\,I_u(0)\bigr)^2\le2E\bigl(I_{[\|x\|/2]}(0)-I(0)\bigr).$$

The same relation holds for $C(I(0),I_u(x))$ and $C(I(0),I(x))$. If $u<[\|x\|/2]$ the
lower covariance is 0 while

$$E(I_u(0)I(x))\le E\bigl(I_u(0)\,I_{[\|x\|/2]}(x)\bigr)=E\,I_u(0)\;E\,I_{[\|x\|/2]}(x)$$

and

$$C(I_u(0),I(x))\le E\,I_u(0)\;E\,I_{[\|x\|/2]}(0)-E\,I_u(0)\;E\,I(0)\le E\bigl(I_{[\|x\|/2]}(0)-I(0)\bigr).$$

This shows the middle inequality.
Proof of Theorem 3.7a). Start from the partition

$$M_n=\sum_{x\in K_n}I(x)+\sum_{x\in\partial K_n}Y_n(x),$$

where $I(x)=I(N(x)=\infty)$ and $Y_n(x)$ was defined in the proof of Theorem 2.4.
Since it is easy to check that $n^{-1}\sum_{x\in\partial K_n}Y_n(x)\xrightarrow{P}4\mu$, it remains to show that

$$n^{-1}\sum_{x\in K_n}(I(x)-P_\infty)\xrightarrow{d}N(0,\sigma^2).$$

Write $I(x)-P_\infty=Y_u'(x)+Y_u''(x)$, where $Y_u'(x)=I_u(x)-EI_u(x)$ and $I_u(x)$ was defined
in Lemma 3.9.
From here on, the arguments in the proof of Theorem 3.5 may be repeated
almost literally, referring to Lemma 3.9 concerning the uniform convergence.


Proof of Theorem 3.7b). In this case we are to show that

$$n^{-1/2}\sum_{x\in\partial K_n}(Y_n(x)-\mu)\xrightarrow{d}N(0,4\gamma^2),\quad\text{as }n\to\infty.\qquad(3.3)$$

Extend the definition of $Y(x)$ in Theorem 2.4 in the natural way to all x in $\partial K_n$.
Then, one may drop the indices of the Y's in (3.3), as

$$E\Bigl|n^{-1/2}\sum_{x\in\partial K_n}(Y_n(x)-Y(x))\Bigr|\le8n^{-1/2}\sum_{i=0}^{[n/2]}E|Y_n(i,0)-Y(i,0)|\le16n^{-1/2}\sum_{i=0}^{[n/2]}E\,N(i,0)\,I(N(i,0)\ge i),$$

which tends to zero since $EN^2<\infty$.

Introduce $Y_u'(x)=Y(x)\,I(C(x)\cap A_u(x)=\emptyset)$ and $Y_u''(x)$ by

$$Y(x)=Y_u'(x)+Y_u''(x).$$

Let further $J_{n,u}$ be those points in $\partial K_n$ which are at a distance no less than 2u
from any corner of $K_n$ and consider the partition

$$n^{-1/2}\sum_{x\in\partial K_n}(Y(x)-\mu)=n^{-1/2}\sum_{x\in J_{n,u}}(Y_u'(x)-EY_u'(x))+n^{-1/2}\sum_{x\in\partial K_n\setminus J_{n,u}}(Y(x)-\mu)$$

$$+\,n^{-1/2}\sum_{x\in J_{n,u}}(Y_u''(x)-EY_u''(x))=X_{un}+\delta_{un}^{(1)}+\delta_{un}^{(2)}=X_{un}+\delta_{un}.$$

We shall apply Lemma 3.1 to this partition.
$X_{un}$ can be split into four independent terms and the one-dimensional
analogue of Lemma 3.3 may be applied to each part. Thus

$$X_{un}\xrightarrow{d}N(0,4\gamma_u^2),\quad\text{where}\quad\gamma_u^2=\sum_i C(Y_u'(0,0),Y_u'(i,0)).$$

This verifies (i). The sum further converges uniformly in u by Lemma 3.8 and it
follows that

$$\lim_{u\to\infty}\gamma_u^2=\sum_i C(Y(0,0),Y(i,0))=\gamma^2,$$

which verifies (ii). To verify (iii) it remains to check that

$$\lim_{u\to\infty}\limsup_{n\to\infty}E\bigl(\delta_{un}^{(\nu)}\bigr)^2=0\quad\text{for }\nu=1,2.$$

For $\nu=1$ this is immediate. For $\nu=2$ it suffices by Lemma 3.2a) to show that

$$\lim_{u\to\infty}\sum_i|C(Y_u''(0,0),Y_u''(i,0))|=0$$

and this follows as before since the sum converges uniformly in u by Lemma 3.8.

Acknowledgement. I wish to thank C.G. Esseen for his most valuable support
and advice during the preparation of this manuscript.


Note. Results similar to ours have independently been obtained by G.R. 
Grimmett (preprints: “On the differentiability of the number of clusters per 
vertex in the percolation model” and “Central limit theorems in percolation 
theory”) and T. Cox (personal communication). 
Convergence of Non-Ergodic Dynamical Systems


Olav Kallenberg
Mathematics Department, CTH and GU, S-41296 Göteborg, Sweden


Summary. Let $\{T_t\}$ be a flow on a probability space $(S,\mathcal{S},\mu)$ which describes
the time evolution of a dynamical system with state space S, and interpret $\mu$
as the initial distribution of the system. Then the distribution of the system
at time t is given by $\mu T_t^{-1}$. Our aim is to study the asymptotic behavior of
$\mu T_t^{-1}$ both in general and in the particular cases of random rate and almost
periodic systems. The results seem to indicate that convergence or mean
convergence is the normal behavior in the non-ergodic case.


1. Introduction 


Let $\mathcal{T}=\{T_t,\ t\in R\}$ be a flow acting on a probability space $(S,\mathcal{S},\mu)$. The aim of
the present paper is to study the asymptotic behavior of $\mu T_t^{-1}$, the image of $\mu$
under $T_t$. Very little seems to be known about this problem, except in the special
case when $\mu$ is absolutely continuous ($\ll$) with respect to some probability
measure $\bar\mu$ which is stationary and ergodic under $\mathcal{T}$.

The above problem arises in the study of dynamical systems as follows. Let S
be the space of possible states of the system, and let $\mathcal{T}$ describe the time
evolution, which is supposed to be purely deterministic. Uncertainty (and hence
probabilities) will typically enter (and can only enter) into the picture through
an incomplete knowledge of the initial state. Our knowledge about the system at
time t should then be represented by a probability distribution $\mu_t$ on $(S,\mathcal{S})$
rather than by a specific state $x_t\in S$. Note that the law of evolution $x_t=T_tx_0$
induces the consistency relation $\mu_t=\mu T_t^{-1}$, where $\mu=\mu_0$ is some fixed probability
measure. For the kind of systems considered in statistical mechanics, one may
assume that t is large (in the sense that, on an average, $x_t$ has little resemblance
with $x_0$). It is then natural to represent the actual uncertainty by the limiting
measure $\bar\mu$, provided that $\mu_t$ converges in some sense.

In classical statistical mechanics, certain invariant and ergodic measures $\bar\mu$
derived from the so-called Liouville measures were introduced as an artificial
means for calculating certain time averages. More recently, there has been some
interest among physicists in looking at $\bar\mu$ probabilistically as a basis for examining
the micro-behavior of the system. It has been argued that this approach makes
sense only when $\bar\mu$ arises as the limit of $\mu T_t^{-1}$ for arbitrary $\mu\ll\bar\mu$, which requires
that $\bar\mu$ be mixing (cf. [12]). However, it is usually enough that mixtures of
Liouville type measures arise in the limit, and if this can be shown to occur
under broad conditions on $\mu$, we have in fact arrived at a probabilistic justification
of statistical mechanics. (The fact that statistical mechanics leads to the
"right" conclusions about the macro-behavior will of course remain the principal
justification of the theory.)

When looking at $\mu$ as the uncertainty at time 0, there is no reason to assume
ergodicity, in the sense of restriction to an invariant surface. (In fact, invariant
quantities like the total energy can only be measured with limited accuracy.)
Indeed, one would rather expect a fairly smooth distribution over the class of
invariant surfaces. At first thought, one might believe that the asymptotic
behavior of $\mu T_t^{-1}$ would only reflect what happens in the ergodic case. Our
results in §§3-4 seem to indicate, however, that the asymptotic behavior of $\mu_t$
depends less on the properties of the individual ergodic components than on the
way in which these are mixed together, and that convergence or mean convergence
(see below) is in fact the typical behavior in the non-ergodic case. It
will further be seen how the Liouville type measures do arise as typical limits.

The present paper is a rather disconnected collection of some fairly elementary
results obtainable by different methods, all of which have a bearing on the
general problem described above. (The task of developing a systematic theory is
left to specialists who may find the problem challenging.) We turn to a brief
description of the contents of the paper.

In §2, no specific assumptions are made about the system. Our main result
here is Theorem 2.2, where we assume that $\mu\ll$ some $\bar\mu$, a fixed stationary
distribution, and give necessary and sufficient conditions for mean convergence
of $\mu_t$ towards some $\mu_\infty$, in the sense that

$$\lim_{r\to\infty}\frac1r\int_0^r|\mu_tB-\mu_\infty B|\,dt=0,\qquad B\in\mathcal S.\tag{1}$$


In the special case when $\bar\mu$ is ergodic, (1) is equivalent to weak mixing of $\bar\mu$, and
our result reduces to the classical mixing theorem (Theorem 1.10 in [18]). We
shall make use of the fact that, in the present case, convergence of $\mu T_t^{-1}$ is a
special case of Rényi stability [17]. The difference in behavior between ergodic
and non-ergodic systems reflects the fact that Rényi stable systems cannot in 
general be decomposed into Rényi mixing ones, contrary to what is claimed in 
[17], p. 301. 

The interest of §2 is purely theoretical, and in order to obtain more explicit
and practically applicable results, it seems necessary to make specific assumptions
about the system. In §3 we study a rather broad class of systems, to be
called random rate systems. For these, $S$ is assumed to be a product space $U\times R$,
where the real component is invariant and prescribes the rate at which a
trajectory in $U$ is traversed. The model is more general than it might appear at
first sight, and it includes many systems of practical interest, such as systems
of rigid bodies interacting by elastic collisions (e.g. billiard systems), and systems
of particles interacting by gravitational forces. (In the former case, the motion
will be the same except for speed, if all velocities are multiplied by a constant. In
the latter case, an increase in velocities by a factor $c>0$ has to be compensated
by a reduction in size by a factor $c^{-2}$, in order that the evolution of the system
should remain the same apart from size and speed. Cf. Kepler's third law!) We
shall prove two results on random rate systems to the effect that convergence or
mean convergence takes place whenever the rate distribution is sufficiently
smooth.
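
As a rough numerical illustration of the last claim (a sketch under assumptions that are not taken from the paper), consider the simplest random rate system: a single point on the circle $[0,1)$ which starts at position 0 and moves at a constant rate $r$, where $r$ is uniformly distributed on $[1,2]$. The helper names `measure_of_low_half` and `mu_t_B`, the set $B=[0,1/2)$ and the rate interval are all hypothetical choices. The function evaluates $\mu_tB$ exactly, and for this choice the deviation from the limiting value $1/2$ is at most $1/(4t)$.

```python
import math

def measure_of_low_half(a, b):
    """Lebesgue measure of {x in [a, b]: frac(x) < 1/2}, for 0 <= a <= b."""
    def F(x):
        # measure of {y in [0, x]: frac(y) < 1/2}
        n = math.floor(x)
        return 0.5 * n + min(x - n, 0.5)
    return F(b) - F(a)

def mu_t_B(t, lo=1.0, hi=2.0):
    """P(frac(t * r) < 1/2) for a rate r uniform on [lo, hi] and t > 0.

    This is mu_t(B) for B = [0, 1/2) when the point starts at position 0.
    """
    return measure_of_low_half(t * lo, t * hi) / (t * (hi - lo))

for t in (0.3, 3.3, 17.9, 1000.0):
    print(t, mu_t_B(t))
```

For small $t$ the value is far from $1/2$ (for instance `mu_t_B(0.3)` is $2/3$), whereas a rate distribution concentrated at a single value would make $\mu_tB$ oscillate between 0 and 1 for ever.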

In §4 we make a further specialization to what we call almost periodic 
systems. For these, the “positions” are given by a point on a torus, and the 
motion is at constant rate in each coordinate. By a suitable parametrization, 
many systems of practical interest reduce to special cases. In particular, the 
model applies in an obvious way to systems of non-interacting particles enclosed 
in rectangular (or even spherical!) containers with reflecting walls. One may 
further think of planetary systems, systems of harmonic oscillators [12], and 
stochastic processes with discrete but random spectra (cf. chords in music!). 
Mathematically, almost periodic systems are simple enough to admit a complete 
and explicit treatment, leading in Theorem 4.1 to convergence criteria in terms 
of smoothness assumptions on the joint velocity distribution. The present results 
are similar to those obtainable in the much more complex case of infinite 
systems of free particles moving in the entire space (cf. [6, 8, 10, 15, 16]), and 
hopefully they may be helpful for a better understanding of the latter. (Note, 
incidentally, that the present finite systems may be converted to infinite ones by 
a periodic continuation.) 

In §5 we consider finite systems of particles which move independently
according to a common flow $\mathcal T=\{T_t\}$. Write $\mathcal T_m=\{T_t^m\}$ for the induced flow for
systems of $m$ particles. Suppose we can prove convergence or at least asymptotic
invariance of $\mu T_t^{-1}$, i.e. in the one-particle case, for some broad class of initial
distributions $\mu$, e.g. by using the methods of §§3-4. The problem is then to
deduce, under suitable conditions, that convergence takes place in the $m$-particle
case also, i.e. for $\mu_m(T_t^m)^{-1}$. It is easy to give a counter-example showing that the
latter convergence is not automatic, in the sense that convergence of the 
marginals need not imply joint convergence. Theorem 5.1 states conditions 
under which this implication does obtain. The natural framework for the 
problem seems to be the theory of random measures, and especially of the 
Papangelou conditional intensities. The present theorem is closely related to 
certain results in the literature for infinite systems [6-10, 15-16], and again the 
results in the finite and infinite cases illuminate each other. 

The conditions needed for convergence below and in [10] are typically 
regularity assumptions on the measures involved which are weaker than ab- 
solute continuity. In particular, the notions of local invariance and the Riemann- 
Lebesgue property play an important role in the present context. In an 
appendix, these properties are shown to induce a refinement of the classical 
Lebesgue decomposition of measures. We further show by an example that the 
Riemann-Lebesgue property is strictly weaker than local invariance. 

We conclude this section by introducing some terminology and notation. 
Given a flow $\mathcal T$ on $S$, we define the flows $\mathcal T_2$ and $\mathcal T^2$ on $S^2$ by $\mathcal T_2=\{T_t\times T_t\colon$





332 O. Kallenberg 


$t\in R\}$ and $\mathcal T^2=\{T_s\times T_t\colon s,t\in R\}$. If $\mathcal T'$ is a flow on $U$, then the same symbol will
be used for its natural extensions to product spaces of the form $U\times V$. The $\sigma$-fields
of $\mathcal T$- and $\mathcal T_2$-invariant events are denoted by $\mathcal I$ and $\mathcal I_2$ respectively (with
the same affixes if any), whereas $\mathcal S^2$ and $\mathcal I^2$ will denote the product $\sigma$-fields
generated by $\mathcal S$ and $\mathcal I$ respectively. If $\mu$ is stationary (i.e. $\mathcal T$-invariant), then $\mathcal T$
or $\mu$ is said to be ergodic whenever $\mathcal I$ is $\mu$-trivial in the sense that $\mathcal I=\{\emptyset,S\}$ a.s.
(i.e. apart from $\mu$-null events). Our notations for conditional distributions follow
the pattern $\mu(\cdot\,|\,\mathcal I)$. Further note that $\mu f=\int f(x)\,\mu(dx)$, while by definition $f\mu\ll\mu$
with density $f$, and $B\mu=1_B\mu$. Moreover, $(f\circ g)(x)=f(g(x))$ while $(f\times g)(x,y)=f(x)g(y)$.
A class $\mathcal C\subset\mathcal S$ which is closed under finite intersections and satisfies
$\sigma(\mathcal C)=\mathcal S$ is called an intersection class generating $\mathcal S$.

Let us write $\mathcal F=\mathcal F(S)$ for the class of bounded measurable functions
$S\to R_+$, and (whenever the space $S$ is topological) write $\mathcal F_0$ and $\mathcal F_c$ for the
subclasses of continuous $\mathcal F$-functions and of $\mathcal F_0$-functions with compact support
respectively. The Borel $\sigma$-field in $S$ is denoted by $\mathcal B(S)$. Note that the set-wise
convergence $\mu_t\to\mu$ implies $\mu_tf\to\mu f$ for all $f\in\mathcal F$. By the weak convergence
$\mu_t\xrightarrow{w}\mu$ we mean that $\mu_tf\to\mu f$ for all $f\in\mathcal F_0$. The class of measures on $S$ (locally
finite when $S$ is a topological space) is denoted by $\mathcal M(S)$, and by $\mathcal N(S)$ we denote
the subclass of probability measures. For real numbers $c_t$ and $c$ and for any
sequence $\{\lambda_n\}\subset\mathcal N(R)$, we write $c_t\to c\ (\lambda_n)$ to denote the mean convergence
$\int|c_t-c|\,\lambda_n(dt)\to0$. Moreover, $\mu_t\to\mu\ (\lambda_n)$ will mean that $\mu_tf\to\mu f\ (\lambda_n)$ for all $f\in\mathcal F$,
and similarly for $\mu_t\xrightarrow{w}\mu\ (\lambda_n)$. Note that (1) is equivalent to $\mu_t\to\mu_\infty\ (\lambda_n)$ with $\lambda_n$
taken to be the uniform distribution over $[0,n]$. Replacing $(\lambda_n)$ in our notation
by the symbol of some class of sequences means that there is mean convergence for
all sequences $\{\lambda_n\}$ in that class.

In particular we introduce the class $\Lambda$ of sequences $\{\lambda_n\}\subset\mathcal N(R)$ such that
$\|\lambda_n*\delta_a-\lambda_n\|\to0$ for every $a\in R$. (Here $\delta_a$ denotes the Dirac measure at $a$, while
$\|\cdot\|$ denotes the supremum of the absolute value both for point and set
functions.) The more extensive class of sequences $\{\lambda_n\}\subset\mathcal N(R)$ such that
$\{\lambda_n*\gamma\}\in\Lambda$ for all absolutely continuous $\gamma\in\mathcal N(R)$ is denoted by $\Lambda_{ac}$. (Cf. [14],
§§11.8-10, for some basic properties of $\Lambda$ and $\Lambda_{ac}$.) A bounded measure $\lambda\in\mathcal M(R)$
is said to be locally invariant (cf. [10]), if the sequence $\lambda_n(dx)=\lambda(dx/n)/\lambda R$
belongs to $\Lambda_{ac}$. For unbounded measures, the same property is required for all
restrictions to finite intervals.

Some further conventions are to write N for the natural numbers and Z for 
the integers. Whenever random elements are considered, the underlying proba- 
bility measure is denoted by P and the corresponding expectation by E. 
Convergence in distribution (in the sense of [1]) is denoted by $\xrightarrow{d}$, and by
$\xi_t\xrightarrow{d}\xi\ (\lambda_n)$ we mean that $P\xi_t^{-1}\xrightarrow{w}P\xi^{-1}\ (\lambda_n)$. When we write $f\in L_1(\mu)$, it is
understood that $f\ge0$.


2. Some General Results 


In this section we make no specific assumptions about the nature of our
dynamical system $(S,\mathcal S,\mathcal T)$, except that some regularity will occasionally be
required. For the reader's convenience, the exposition is elementary and avoids
the sophisticated techniques of e.g. Koopman and von Neumann [11] (which
might shorten the arguments slightly). We further include full proofs of some
simple auxiliary results which may be well-known but are not easily accessible
in the literature.


The following lemma explains why the asymptotic behavior of $\mu T_t^{-1}$ is
related to properties of the flow $\mathcal T_2$ on $S^2$.


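Lemma 2.1. Suppose that $\nu\in\mathcal N(S)$ and $\{\lambda_n\}\subset\mathcal N(R)$, and define

$$\kappa_n=\int(\mu T_t^{-1})^2\,\lambda_n(dt),\qquad n\in N.\tag{1}$$

(i) If $\mu_0\in\mathcal N(S)$ is $\mathcal T$-invariant and $\mu\ll\mu_0$, then $\mu T_t^{-1}\to\nu\ (\lambda_n)$ iff $\kappa_n\to\nu^2$.
(ii) If $S$ is a separable metric space, then $\mu T_t^{-1}\xrightarrow{w}\nu\ (\lambda_n)$ iff $\kappa_n\xrightarrow{w}\nu^2$.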


The condition $\kappa_n\xrightarrow{w}\nu^2$ admits the following interpretation. Consider two
non-interacting particles in $S$ which both move according to $\mathcal T$, and choose their
initial positions independently according to $\mu$. Let $\xi_t$ denote the state of the
system at time $t$, and introduce random variables $\tau_n$ which are independent of
$\{\xi_t\}$ with distributions $\lambda_n$, $n\in N$. Then $\kappa_n\xrightarrow{w}\nu^2$ states that the particle positions
of $\xi_{\tau_n}$ are asymptotically independent with distribution $\nu$.


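Proof. Suppose that $\kappa_n\to\nu^2$. Then, in particular,
$$\int(\mu T_t^{-1})\,\lambda_n(dt)\to\nu,$$
so by Jensen's inequality we get for any $A\in\mathcal S$
$$\Bigl(\int|\mu T_t^{-1}A-\nu A|\,\lambda_n(dt)\Bigr)^2\le\int(\mu T_t^{-1}A-\nu A)^2\,\lambda_n(dt)
=\int(\mu T_t^{-1}A)^2\,\lambda_n(dt)-2\nu A\int\mu T_t^{-1}A\,\lambda_n(dt)+(\nu A)^2\to0,$$

which proves that $\mu T_t^{-1}\to\nu\ (\lambda_n)$. Similarly, $\kappa_n\xrightarrow{w}\nu^2$ implies $\mu T_t^{-1}\xrightarrow{w}\nu\ (\lambda_n)$
when $S$ is topological.
Suppose conversely that $\mu T_t^{-1}\to\nu\ (\lambda_n)$. Then

$$\int(\mu T_t^{-1}A-\nu A)(\mu T_t^{-1}B-\nu B)\,\lambda_n(dt)\to0,\qquad A,B\in\mathcal S,$$
and by expansion of the integrand and cancellation of terms, this becomes
$$\kappa_n(A\times B)\to\nu^2(A\times B),\qquad A,B\in\mathcal S.\tag{2}$$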


If 4=gyy for some invariant distribution po, we further get for any r>0 and 
Cef? 


K, C=J(uT,-*)? CA, (dt)=J((g Ho) T,~')? CA, (dt) S(uf{g>r})? +r? UG C, 


and here the right side is independent of $n$ and tends to zero as $C\downarrow\emptyset$ and then
$r\to\infty$. Thus it may be seen by a monotone class argument (cf. e.g. A2.1 in [7])
that (2) extends to arbitrary sets in $\mathcal S^2$, as desired. If instead $\mu T_t^{-1}\xrightarrow{w}\nu\ (\lambda_n)$, we
get in place of (2)

$$\kappa_n(f\times g)\to\nu^2(f\times g),\qquad f,g\in\mathcal F_0(S),$$

and when $S$ is a separable metric space, this extends to arbitrary functions in
$\mathcal F_0(S^2)$ by Theorems 2.1 and 3.2 in [1]. □
Our next result characterizes mean convergence of $\mu T_t^{-1}$ in the absolutely
continuous case, and extends the classical mixing theorem. We assume the space
$(S,\mathcal S)$ to be such that any conditional distributions exist on $S$. Given any
$\mu\in\mathcal N(S)$, let $\tilde\mu\in\mathcal N(S^2)$ be defined by

$$\tilde\mu(A\times B)=\int\mu(A\,|\,\mathcal I)\,\mu(B\,|\,\mathcal I)\,d\mu,\qquad A,B\in\mathcal S.\tag{3}$$


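Theorem 2.2. For $\mathcal T$-invariant $\mu\in\mathcal N(S)$, the following statements are equivalent:
(i) $\mathcal I_2=\mathcal I^2$ a.s. $\mu^2$,
(i') $\mu(\cdot\,|\,\mathcal I)(x)\times\mu(\cdot\,|\,\mathcal I)(y)$ is $\mathcal T_2$-ergodic for $(x,y)\in S^2$ a.e. $\mu^2$,
(ii) $(g\mu)T_t^{-1}\to\tilde\mu(\cdot\times g)\ (\Lambda)$ for all $g\in L_1(\mu)$,
(ii') $\mu(B\cap T_t^{-1}B)\to$ some $c_B\ (\lambda_n)$, $B\in\mathcal C$, for some $\{\lambda_n\}\in\Lambda$ and some intersection
class $\mathcal C$ generating $\mathcal S$.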

The class $\Lambda$ was first used in ergodic theory by Brézis and Browder [3].
According to our theorem, the statement that $(g\mu)T_t^{-1}$ be $\lambda_n$-mean convergent
for all $g\in L_1(\mu)$ is simultaneously true or false for all $\{\lambda_n\}\in\Lambda$. Since $L_1$-convergence
implies a.s. convergence of a subsequence, (ii) is seen to imply

$$\mu(A\cap T_t^{-1}B)\to\tilde\mu(A\times B),\qquad A,B\in\mathcal S,\tag{4}$$

for almost all sequences of $t$-values which grow rapidly enough (in a suitable
sense). The corresponding sequence of events $T_t^{-1}B$ is by definition Rényi stable
with local density $\mu(B\,|\,\mathcal I)$ (cf. [17]).

If $\mu$ is ergodic, then $\mathcal I$ and $\mathcal I^2$ are trivial, and (i) reduces to the classical
condition that $\mu^2$ be ergodic. For general $\mu$ it is seen by dominated convergence
that $\mathcal T_2$-ergodicity of $(\mu(\cdot\,|\,\mathcal I)(x))^2$ for $x\in S$ a.s. $\mu$ implies mean convergence. This
should be compared with the present necessary and sufficient condition (i')
(which is strictly weaker, cf. §§3-4 below).

Our proof of Theorem 2.2 will be based on three simple lemmas. 


Lemma 2.3. For any $\{\lambda_n\}\in\Lambda$, there exist positive numbers $t_n\to\infty$ such that the
uniform distributions $\gamma_n$ on $[0,t_n]$ satisfy

$$\|\lambda_n-\lambda_n*\gamma_n\|\to0.\tag{5}$$
Proof. By 11.8.3 in [14], $\|\lambda_n-\lambda_n*\mu\|\to0$ for every $\mu\in\mathcal N(R)$. Writing $\mu_t$ for the
uniform distribution on $[0,t]$, we may hence define a sequence $\{n_k\}$ by $n_1=1$
and

$$n_k=\min\Bigl\{m\colon\ \sup_{n\ge m}\|\lambda_n-\lambda_n*\mu_k\|<k^{-1}\Bigr\},\qquad k=2,3,\dots.$$

This sequence is clearly non-decreasing, and moreover $n_k\to\infty$ since $\lambda_n\ne\lambda_n*\mu_k$
for all $n$ and $k$. Putting

$$t_n=k,\qquad n_k\le n<n_{k+1},\quad k\in N,$$

it is easy to verify that $t_n\to\infty$ and that (5) holds with $\gamma_n=\mu_{t_n}$.
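Lemma 2.4. If $\mu$ is $\mathcal T$-invariant and $\{\lambda_n\}\in\Lambda$, then

$$\int\mu(g(f\circ T_t))\,\lambda_n(dt)\to\tilde\mu(f\times g),\qquad f\in\mathcal F,\ g\in L_1(\mu).\tag{6}$$

Proof. Since both sides of (6) are linear in $g$ and bounded by $\|f\|\,\mu g$, it suffices to
consider bounded $g$. It is further enough to prove (6) for ergodic $\mu$, since the
general result will then follow by Fubini's theorem and dominated convergence:
$$\int\mu(g(f\circ T_t))\,\lambda_n(dt)=\int\lambda_n(dt)\int\mu(g(f\circ T_t)\,|\,\mathcal I)\,d\mu=\int d\mu\int\mu(g(f\circ T_t)\,|\,\mathcal I)\,\lambda_n(dt)
\to\int\mu(f\,|\,\mathcal I)\,\mu(g\,|\,\mathcal I)\,d\mu=\tilde\mu(f\times g).$$

Letting $\gamma_n$ be such as in Lemma 2.3 and writing $\lambda_n'=\lambda_n*\gamma_n$, we get by Fubini's
theorem and the $L_1$ ergodic theorem

$$\Bigl|\int\mu(g(f\circ T_t))\,\lambda_n'(dt)-\mu f\,\mu g\Bigr|
=\Bigl|\int g\,d\mu\Bigl[\int(f\circ T_t)\,\lambda_n'(dt)-\mu f\Bigr]\Bigr|$$
$$=\Bigl|\int g\,d\mu\int\lambda_n(ds)\Bigl[\int(f\circ T_u)\,\gamma_n(du)-\mu f\Bigr]\circ T_s\Bigr|
\le\|g\|\int d\mu\int\lambda_n(ds)\,\Bigl|\int(f\circ T_u)\,\gamma_n(du)-\mu f\Bigr|\circ T_s$$
$$=\|g\|\int\lambda_n(ds)\int d\mu\,\Bigl|\int(f\circ T_u)\,\gamma_n(du)-\mu f\Bigr|\circ T_s
=\|g\|\int d\mu\,\Bigl|\int(f\circ T_u)\,\gamma_n(du)-\mu f\Bigr|\to0.$$

Hence
$$\int\mu(g(f\circ T_t))\,\lambda_n'(dt)\to\mu f\,\mu g=\tilde\mu(f\times g),$$
and (6) follows by means of (5).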


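Lemma 2.5. For any $\mu$ and $\mathcal T$,

$$\mu^2(f\times g\,|\,\mathcal I^2)=\mu(f\,|\,\mathcal I)\times\mu(g\,|\,\mathcal I),\qquad f,g\in L_1(\mu),\quad\text{a.s. }\mu^2.\tag{7}$$

Proof. Since the right-hand side of (7) is $\mathcal I^2$-measurable, it suffices to verify that

$$\int_I\mu(f\,|\,\mathcal I)\times\mu(g\,|\,\mathcal I)\,d\mu^2=\int_I(f\times g)\,d\mu^2,\qquad I\in\mathcal I^2,\tag{8}$$

and by a monotone class argument, it is enough to take $I=I_1\times I_2$ for arbitrary
$I_1,I_2\in\mathcal I$. But in that case, (8) reduces to the identity

$$\int_{I_1}\mu(f\,|\,\mathcal I)\,d\mu\int_{I_2}\mu(g\,|\,\mathcal I)\,d\mu=\int_{I_1}f\,d\mu\int_{I_2}g\,d\mu.\qquad\square$$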
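Proof of Theorem 2.2. By Lemma 2.5, (i') implies that $\mu^2(\cdot\,|\,\mathcal I^2)$ is $\mathcal T_2$-ergodic a.s.
$\mu^2$. For arbitrary $I\in\mathcal I_2$, it then follows that $\mu^2(I\,|\,\mathcal I^2)=0$ or 1 a.s. Let $I'$ be the
$\mathcal I^2$-set where $\mu^2(I\,|\,\mathcal I^2)=1$, and note that

$$\mu^2(I\setminus I')=\int_{I'^c}\mu^2(I\,|\,\mathcal I^2)\,d\mu^2=0,$$

and similarly for $\mu^2(I^c\cap I')$. It follows that $I'=I$ a.s., so $\mathcal I_2\subset\mathcal I^2$ a.s. The
converse relation being trivial, this shows that (i') implies (i).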

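Applying Lemma 2.4 to $\mu^2$, we next obtain for any $f_1,f_2\in\mathcal F$ and $g_1,g_2\in L_1(\mu)$

$$\int\mu(g_1(f_1\circ T_t))\,\mu(g_2(f_2\circ T_t))\,\lambda_n(dt)\to\int\mu^2(f_1\times f_2\,|\,\mathcal I_2)\,\mu^2(g_1\times g_2\,|\,\mathcal I_2)\,d\mu^2.\tag{9}$$

Assuming (i), we get by (9) and Lemma 2.5 for any $f\in\mathcal F$ and $g\in L_1(\mu)$

$$\int(\mu(g(f\circ T_t)))^2\,\lambda_n(dt)\to\int\mu^2(f\times f\,|\,\mathcal I_2)\,\mu^2(g\times g\,|\,\mathcal I_2)\,d\mu^2
=\int\mu^2(f\times f\,|\,\mathcal I^2)\,\mu^2(g\times g\,|\,\mathcal I^2)\,d\mu^2=\Bigl[\int\mu(f\,|\,\mathcal I)\,\mu(g\,|\,\mathcal I)\,d\mu\Bigr]^2=(\tilde\mu(f\times g))^2,$$
and (ii) follows as in case of Lemma 2.1. Since (ii') follows trivially from (ii), it
remains to prove that (ii') implies (i').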

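To see this, let $\{\lambda_n\}$, $\{c_A\}$ and $\mathcal C$ be such as in (ii'), and let $\{\tau_n\}$ be independent
random variables with distributions $\{\lambda_n\}$. Writing $\tilde\lambda_n=P(-\tau_n)^{-1}$ and using the
invariance of $\mu$, we get for any $A\in\mathcal C$ and distinct $k,n\in N$

$$E|\mu(T_{\tau_k}^{-1}A\cap T_{\tau_n}^{-1}A)-c_A|=E|\mu(A\cap T_{\tau_n-\tau_k}^{-1}A)-c_A|
=\int|\mu(A\cap T_t^{-1}A)-c_A|\,(\tilde\lambda_k*\lambda_n)(dt),$$

and since $\|\tilde\lambda_k*\lambda_n-\lambda_n\|\to0$ for fixed $k$, the left-hand side tends to zero as $n\to\infty$.
For fixed $A\in\mathcal C$ and any subsequence $N'\subset N$, there hence exists a further
subsequence $N''\subset N'$ such that

$$\lim_{n\in N''}\mu(T_{\tau_k}^{-1}A\cap T_{\tau_n}^{-1}A)=c_A,\qquad k\in N,\quad\text{a.s.}$$

By Theorem 3 of Rényi [17], this yields

$$\mu(B\cap T_{\tau_n}^{-1}A)\to\nu_AB,\qquad B\in\mathcal S,\quad\text{a.s.}\ (n\in N'')\tag{10}$$

for some random probability measure $\nu_A$ on $S$, and by Kolmogorov's 0-1 law, $\nu_A$
is in fact a.s. non-random. Since $N'$ was arbitrary, it follows from (10) by
dominated convergence that (along $N$)

$$\mu(B\cap T_t^{-1}A)\to\nu_AB\ (\lambda_n),\qquad A\in\mathcal C,\ B\in\mathcal S,\tag{11}$$
and in particular
$$\int\mu(B\cap T_t^{-1}A)\,\lambda_n(dt)\to\nu_AB,\qquad A\in\mathcal C,\ B\in\mathcal S,$$

so by (6), $\nu_AB=\tilde\mu(A\times B)$. Inserting this into (11) and using a monotone class
argument, we obtain

$$\mu(B\cap T_t^{-1}A)\to\tilde\mu(A\times B)\ (\lambda_n),\qquad A,B\in\mathcal S.\tag{12}$$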


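Proceeding as in the proof of Lemma 2.1, it is seen from (12) that, for any $A_1$,
$A_2$, $B_1$, $B_2\in\mathcal S$,

$$\int\mu(A_1\cap T_t^{-1}B_1)\,\mu(A_2\cap T_t^{-1}B_2)\,\lambda_n(dt)\to\tilde\mu(A_1\times B_1)\,\tilde\mu(A_2\times B_2),$$
so by (3), (7) and (9),

$$\int\mu^2(A_1\times A_2\,|\,\mathcal I_2)\,\mu^2(B_1\times B_2\,|\,\mathcal I_2)\,d\mu^2=\int\mu^2(A_1\times A_2\,|\,\mathcal I^2)\,\mu^2(B_1\times B_2\,|\,\mathcal I^2)\,d\mu^2.$$

By a monotone class argument, this extends to

$$\int\mu^2(A\,|\,\mathcal I_2)\,\mu^2(B\,|\,\mathcal I_2)\,d\mu^2=\int\mu^2(A\,|\,\mathcal I^2)\,\mu^2(B\,|\,\mathcal I^2)\,d\mu^2,\qquad A,B\in\mathcal S^2,$$
and we get in particular
$$\int(\mu^2(A\,|\,\mathcal I_2))^2\,d\mu^2=\int(\mu^2(A\,|\,\mathcal I^2))^2\,d\mu^2,\qquad A\in\mathcal S^2.$$
Writing $\xi=\mu^2(A\,|\,\mathcal I_2)$ and using the fact that $\mathcal I^2\subset\mathcal I_2$, we hence obtain
$$\mu^2(\xi-\mu^2(\xi\,|\,\mathcal I^2))^2=\mu^2\xi^2-2\mu^2(\xi\,\mu^2(\xi\,|\,\mathcal I^2))+\mu^2(\mu^2(\xi\,|\,\mathcal I^2))^2
=\mu^2\xi^2-\mu^2(\mu^2(\xi\,|\,\mathcal I^2))^2=0,$$
so
$$\xi=\mu^2(\xi\,|\,\mathcal I^2)=\mu^2[\mu^2(A\,|\,\mathcal I_2)\,|\,\mathcal I^2]=\mu^2(A\,|\,\mathcal I^2)\quad\text{a.s. }\mu^2.$$

Thus $\mu^2(\cdot\,|\,\mathcal I^2)=\mu^2(\cdot\,|\,\mathcal I_2)$ a.s., and since $\mu^2(\cdot\,|\,\mathcal I_2)$ is trivially $\mathcal T_2$-ergodic, (i')
follows by Lemma 2.5. □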


Our next result shows that, in the absolutely continuous case, a compound
system is mean convergent whenever its component systems are so, and that the
limit is then composed of ergodic product measures.


Theorem 2.6. For any $\mathcal T$-invariant $\mu_1$ and $\mu_2$, $(g(\mu_1\times\mu_2))T_t^{-1}$ is mean convergent
iff $(g_i\mu_i)T_t^{-1}$ is so for all $g_i\in L_1(\mu_i)$, $i=1,2$, and then $\mathcal I_2=\mathcal I^2$ a.s. $\mu_1\times\mu_2$.


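Proof. Suppose that $(g_i\mu_i)T_t^{-1}$ is mean convergent for all $g_i\in L_1(\mu_i)$, $i=1,2$. Then
so is $(g(\mu_1+\mu_2))T_t^{-1}$ for all $g\in L_1(\mu_1+\mu_2)$, and it follows from Theorem 2.2 that
$\mathcal I_2=\mathcal I^2$ a.s. $(\mu_1+\mu_2)^2$. Since

$$(\mu_1+\mu_2)^2=\mu_1^2+(\mu_1\times\mu_2)+(\mu_2\times\mu_1)+\mu_2^2,$$

the last assertion follows.
For arbitrary $A_1,A_2,B_1,B_2\in\mathcal S$ and $\{\lambda_n\}\in\Lambda$ we next obtain

$$\int|\mu_1(A_1\cap T_t^{-1}B_1)\,\mu_2(A_2\cap T_t^{-1}B_2)-\tilde\mu_1(A_1\times B_1)\,\tilde\mu_2(A_2\times B_2)|\,\lambda_n(dt)
\le\sum_{i=1}^2\int|\mu_i(A_i\cap T_t^{-1}B_i)-\tilde\mu_i(A_i\times B_i)|\,\lambda_n(dt)\to0.$$
Since clearly

$$\mu_i(A_i\cap T_t^{-1}B_i)\vee\tilde\mu_i(A_i\times B_i)\le\mu_iA_i\wedge\mu_iB_i,\qquad i=1,2,$$
it follows by a monotone class argument that
$$\int|(\mu_1\times\mu_2)(A\cap T_t^{-1}B)-(\tilde\mu_1\times\tilde\mu_2)(A\times B)|\,\lambda_n(dt)\to0,\qquad A,B\in\mathcal S^2,$$

which is easily extended to

$$\int|(\mu_1\times\mu_2)(g(f\circ T_t))-(\tilde\mu_1\times\tilde\mu_2)(f\times g)|\,\lambda_n(dt)\to0$$

for arbitrary $f\in\mathcal F(S^2)$ and $g\in L_1(\mu_1\times\mu_2)$. Hence $(g(\mu_1\times\mu_2))T_t^{-1}$ is mean convergent
for every $g\in L_1(\mu_1\times\mu_2)$.
The proof in the converse direction is trivial. □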


We next turn to the case of arbitrary initial distributions. Given any
sequence $\{\lambda_n\}\subset\mathcal N(R)$, we define the corresponding averages $\bar f$ and $\bar\mu$ of functions
and measures $f$ and $\mu$ on $S$ by

$$\int(f\circ T_t)\,\lambda_n(dt)\to\bar f,\qquad\int\mu T_t^{-1}\,\lambda_n(dt)\xrightarrow{w}\bar\mu,$$
whenever these limits exist. For any $x\in S$, write $\mathcal Tx=\{T_tx,\ t\in R\}$, and let $\overline{\mathcal Tx}$
denote the closure of $\mathcal Tx$. A Borel set in $S$ is said to be uniquely ergodic if it
supports exactly one $\mathcal T$-invariant probability measure on $S$ (cf. [18], p. 135).


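Theorem 2.7. Let $S$ be metric and separable, and let $\mu$ and $\{\lambda_n\}$ be arbitrary. Then
$\mu T_t^{-1}\xrightarrow{w}$ some $\nu\ (\lambda_n)$ iff $\overline{\mu^2}$ exists and equals $\bar\mu^2$, and then $\nu=\bar\mu$. If $S$ is further
compact, then $\mu T_t^{-1}\xrightarrow{w}\bar\mu\ (\Lambda)$ provided $\overline{\mathcal Tx}\times\overline{\mathcal Ty}$ is uniquely ergodic for $(x,y)\in S^2$
a.s. $\mu^2$.


Note that $\Lambda$ may be replaced by $\Lambda_{ac}$ in the last statement, provided that $\mathcal T$ is
uniformly continuous in the sense that $T_tx\to x$ as $t\to0$, uniformly in $x\in S$.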


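Proof. By definition of mean convergence, $\mu T_t^{-1}\xrightarrow{w}\nu\ (\lambda_n)$ implies that $\nu=\bar\mu$, and
so the first assertion holds by Lemma 2.1.
Let us next suppose that $S$ is compact and that the last condition is fulfilled.
Fix $x,y\in S$ such that $\overline{\mathcal Tx}\times\overline{\mathcal Ty}$ supports a unique $\mathcal T_2$-invariant probability
measure $m_{x,y}$ on $S^2$, and note that the marginals $m_x$ and $m_y$ of $m_{x,y}$ are $\mathcal T$-invariant
and supported by $\overline{\mathcal Tx}$ and $\overline{\mathcal Ty}$ respectively. Since $m_x\times m_y$ is $\mathcal T_2$-invariant
with support in $\overline{\mathcal Tx}\times\overline{\mathcal Ty}$, we get $m_{x,y}=m_x\times m_y$. Since moreover $m_{x,y}$ is
unique, $m_x$ and $m_y$ are the only $\mathcal T$-invariant distributions on $\overline{\mathcal Tx}$ and $\overline{\mathcal Ty}$
respectively, so $\overline{\mathcal Tx}$ and $\overline{\mathcal Ty}$ are uniquely ergodic.
Let us now define
$$\mu_n=\int\delta_{T_tx}\,\lambda_n(dt),\qquad n\in N,$$

and note that $\{\mu_n\}$ is weakly relatively compact since $S$ is compact (cf.
Theorem 6.1 in [1]). If $\mu_n\xrightarrow{w}\mu_0$ ($n\in N'$) for some subsequence $N'\subset N$, then $\mu_0$ is
supported by $\overline{\mathcal Tx}$, and it is further seen that, for any $f\in\mathcal F_c$ and $h\in R$,

$$|\mu_0T_h^{-1}f-\mu_0f|=\lim_{n\in N'}\Bigl|\int f(T_{t+h}x)\,\lambda_n(dt)-\int f(T_tx)\,\lambda_n(dt)\Bigr|
=\lim_{n\in N'}\Bigl|\int f(T_tx)\,(\delta_h*\lambda_n)(dt)-\int f(T_tx)\,\lambda_n(dt)\Bigr|
\le\|f\|\limsup_{n\in N'}\|\delta_h*\lambda_n-\lambda_n\|=0,$$

which proves that $\mu_0$ is $\mathcal T$-invariant. Thus $\mu_0=m_x$, and since the limit is hence
independent of the choice of $N'$, it follows that $\mu_n\xrightarrow{w}m_x$ along $N$, which means
that $\bar f(x)$ exists and equals $m_xf$. Similarly $\bar f(y)=m_yf$, and for any $f,g\in\mathcal F_c$,

$$\overline{(f\times g)}(x,y)=m_{x,y}(f\times g)=(m_x\times m_y)(f\times g)=m_xf\,m_yg=\bar f(x)\,\bar g(y),$$
so $\overline{f\times g}$ exists a.s. $\mu^2$, and

$$\overline{f\times g}=\bar f\times\bar g\quad\text{a.s. }\mu^2,\qquad f,g\in\mathcal F_c.\tag{13}$$

It follows in particular that $\bar f$ exists a.s. $\mu$.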


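For any $f\in\mathcal F_c$ we now obtain by dominated convergence and Fubini's
theorem
$$\mu\bar f=\int d\mu\lim_{n\to\infty}\int(f\circ T_t)\,\lambda_n(dt)=\lim_{n\to\infty}\int d\mu\int(f\circ T_t)\,\lambda_n(dt)
=\lim_{n\to\infty}\int\lambda_n(dt)\int(f\circ T_t)\,d\mu,$$
so $\bar\mu$ exists and satisfies
$$\bar\mu f=\mu\bar f,\qquad f\in\mathcal F_c.\tag{14}$$
Applying (14) to both $\mu$ and $\mu^2$, and using (13), we get for any $f,g\in\mathcal F_c$
$$\overline{\mu^2}(f\times g)=\mu^2(\overline{f\times g})=\mu^2(\bar f\times\bar g)=\mu\bar f\,\mu\bar g=\bar\mu f\,\bar\mu g=\bar\mu^2(f\times g),$$

which clearly implies that $\overline{\mu^2}=\bar\mu^2$.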


3. Random Rate Systems 


In this section we specialize to dynamical systems such that $S=U\times R$ for some
$U$ while $\mathcal T$ is given by

$$T_t(u,r)=(T'_{rt}u,\,r),\qquad u\in U,\ r\in R,\tag{1}$$

for some flow $\mathcal T'=\{T'_t,\ t\in R\}$ on $U$. As in §2, we first consider the absolutely
continuous case. Let $\lambda$ denote Lebesgue measure on $R$.


Theorem 3.1. Let $\mu=\mu'\times\lambda$, where $\mu'\in\mathcal N(U)$ is $\mathcal T'$-invariant. Then

$$(g\mu)T_t^{-1}f\to\tilde\mu(f\times g),\qquad f\in\mathcal F,\ g\in L_1(\mu).\tag{2}$$


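Proof. As in case of Theorem 2.2, it suffices to prove that $\mu(A_1\cap T_t^{-1}A_2)$
converges for all $A_1$ and $A_2$ belonging to some intersection class generating $\mathcal S$,
and we may hence take $A_i=B_i\times I$, where $B_1,B_2\in\mathcal B(U)$ while $I$ is a finite real
interval. Writing $\mathcal I'$ for the $\sigma$-field of $\mathcal T'$-invariant events in $U$, we get from
Lemma 2.4

$$\mu(A_1\cap T_t^{-1}A_2)=\int_I\mu'(B_1\cap T_{rt}'^{-1}B_2)\,dr\to\lambda I\,\tilde\mu'(B_1\times B_2)
=\lambda I\int\mu'(B_1\,|\,\mathcal I')\,\mu'(B_2\,|\,\mathcal I')\,d\mu'
=\int\mu(A_1\,|\,\mathcal I)\,\mu(A_2\,|\,\mathcal I)\,d\mu=\tilde\mu(A_1\times A_2),$$
as desired. □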


Our next theorem is similar to some results for free particles in §§4 and 6 of
[10]. Here a weakening of the requirements on $\mu$ is compensated by a strengthening
of those on $S$ and $\mathcal T$. Let $\pi$ denote the projection of $S$ onto $R$. By a
rotation of a group $U$ we mean a mapping $T$ such that $Tu=u+v$ for some fixed
$v\in U$. Let $\mathcal M_{LI}$ and $\mathcal M_D$ denote the classes of locally invariant and diffuse
bounded measures on $R$.


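Theorem 3.2. Let $U$ be a compact Abelian topological group, and let $\mathcal T'$ be a
continuous flow of rotations on $U$. Then $\mu T_t^{-1}\xrightarrow{w}$ some $\bar\mu$ if $\mu\pi^{-1}\in\mathcal M_{LI}$, while
$\mu T_t^{-1}\xrightarrow{w}$ some $\bar\mu\ (\Lambda_{ac})$ if $\mu\pi^{-1}\in\mathcal M_D$.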
Proof. Let $H$ be the normalized Haar measure on $U$. Since $\mathcal T$ commutes with
the group operation on $U$, it is clearly enough to prove weak convergence of

$$((\nu\times\delta_0)*\mu)T_t^{-1}=(\nu\times\delta_0)*(\mu T_t^{-1}),\qquad t\in R,$$

for arbitrary $\nu\in\mathcal N(U)$ such that $\nu\ll H$ with continuous density. But for such $\nu$ it
is easily seen that

$$(\nu\times\delta_0)*\mu\ll H\times\mu\pi^{-1}=\mu_0,$$

and that the corresponding density is bounded and $U$-continuous. For convenience,
we may thus assume that $\mu\ll\mu_0$ and that $g=d\mu/d\mu_0$ is bounded and
continuous.

In the case $\mu\pi^{-1}\in\mathcal M_{LI}$, we have to prove that

$$(g\mu_0)T_t^{-1}f\to\tilde\mu_0(f\times g),\qquad f\in\mathcal F_c.$$

As in the preceding proof, it suffices to verify this for indicators of sets $A_i=B_i\times I$,
where $B_i\in\mathcal B(U)$ with $H\partial B_i=0$, $i=1,2$, while $I$ is a real interval. Defining

$$\lambda_t(ds)=I(\mu\pi^{-1})(ds/t),\qquad s,t\in R,$$

we get as before

$$\mu_0(A_1\cap T_t^{-1}A_2)=\int_IH(B_1\cap T_{rt}'^{-1}B_2)\,\mu\pi^{-1}(dr)=\int H(B_1\cap T_s'^{-1}B_2)\,\lambda_t(ds).$$

Since $H\partial B_2=0$ and since $\mathcal T'$ commutes with the group operation, we may
approximate the last integral uniformly in $t$ by writing $\gamma*\lambda_t$ in place of $\lambda_t$, where
$\gamma\in\mathcal N(R)$ with $\gamma\ll\lambda$. But since $\{\gamma*\lambda_t/\lambda_tR\}\in\Lambda$, we get by Lemma 2.4

$$\int H(B_1\cap T_s'^{-1}B_2)\,(\gamma*\lambda_t)(ds)\to\mu\pi^{-1}I\cdot\tilde H(B_1\times B_2)=\tilde\mu_0(A_1\times A_2),$$

as desired.

Turning to the case of general $\mu\pi^{-1}\in\mathcal M_D$, note that the absolute continuity
of $\mu$ implies that the conditional distributions of $\mu$ induced on $R$ for fixed $u\in U$
are diffuse as well. By dominated convergence, we may thus assume that $\mu$ is
supported by $\{u\}\times R$ for a fixed $u\in U$. But then $\mu T_t^{-1}$ is supported by $\overline{\mathcal T'u}\times R$
for every $t\in R$, and since $\overline{\mathcal T'u}$ is itself a compact Abelian group, we may assume
without loss that it exhausts $U$, i.e. that $U$ is minimal (cf. [18], p. 114). Note that
$U$ is then uniquely ergodic (cf. [18], p. 138).

We now introduce the auxiliary random measure $\xi=\delta_\sigma*\mu$ on $S$, where $\sigma$ is a
random element in $S$ with distribution $H\times\delta_0$. Letting $\tau_1,\tau_2,\dots$ be random
variables which are independent of $\sigma$ with distributions $\lambda_1,\lambda_2,\dots$, we next define
$\eta_n=\xi T_{\tau_n}^{-1}$. We shall prove below that $\eta_n\xrightarrow{d}\mu_0$. Since $\eta_n$ is normalized and
hence uniformly integrable, it will follow that $\eta_n\xrightarrow{w}\mu_0$ in $L_1$. Let $N'\subset N$ be any
sequence such that the latter convergence holds a.s. along $N'$, i.e.

$$(\delta_\sigma*\mu)T_{\tau_n}^{-1}\xrightarrow{w}\mu_0\quad\text{a.s.}\ (n\in N').$$
Then
$$\mu T_{\tau_n}^{-1}\xrightarrow{w}\delta_{-\sigma}*\mu_0=\mu_0\quad\text{a.s.}\ (n\in N'),$$
so
$$\mu T_t^{-1}\xrightarrow{w}\mu_0\ (\lambda_n)\qquad(n\in N'),\tag{3}$$

and since the last limit is independent of the choice of $N'$, (3) must remain true
along $N$ (cf. A1.2 in [7]), as asserted.

To see that $\eta_n\xrightarrow{d}\mu_0$, note that $\{\eta_n\}$ is relatively compact in distribution (cf.
Lemma 4.5 in [7]). By Theorem 2.3 in [1], it thus remains to show that every
distributional limit $\eta$ equals $\mu_0$ a.s. For such an $\eta$, note that $\eta\pi^{-1}=\mu\pi^{-1}$ a.s.
since $\eta_n\pi^{-1}=\mu\pi^{-1}$. Further observe that the $\mathcal T'$-stationarity of $\xi$ carries over to
each $\eta_n$, and hence to $\eta$. Finally conclude from the hypothesis on $\{\lambda_n\}$ that $\eta$ is
$\mathcal T$-stationary also (cf. the proof of Theorem 6.1 in [10]).

These three properties of $\eta$ imply that $E\eta^2$ has projection $(\mu\pi^{-1})^2$ on $R^2$ and
is both $\mathcal T_2$- and $\mathcal T_2'$-invariant. By Krickeberg's Corollary 2 in [6], p. 78, it follows
that $E\eta^2$ admits a disintegration of the form

$$E\eta^2=\int\nu_z\,(\mu\pi^{-1})^2(dz)\tag{4}$$

for some $\mathcal T_2$- and $\mathcal T_2'$-invariant measures $\nu_z\in\mathcal N(U^2\times\{z\})$, $z\in R^2$. By Fubini's
theorem, $(\mu\pi^{-1})^2D=0$ where $D=\{(x,y)\in R^2\colon x=y\}$, and for $z\notin D$ the $\mathcal T_2$- and $\mathcal T_2'$-invariance
of $\nu_z$ implies that $\nu_z$ is $\mathcal T'^2$-invariant. Since $U$ is uniquely ergodic, it
follows that $\nu_z=H^2$ for $z\notin D$, and so we may conclude from (4) that

$$E\eta^2=H^2\times(\mu\pi^{-1})^2=\mu_0^2.$$
By Krickeberg's Theorem 5 in [6], we hence obtain $\eta=\mu_0$ a.s., as required. □


4. Almost Periodic Systems 


In this section we assume that $S=(K\times R)^d$, where $K=[0,1)$ and $d\in N$, while $\mathcal T$ is
given by

$$T_t(q,p)=(q+tp,\,p),\qquad q\in K^d,\ p\in R^d,\tag{1}$$

the addition in $K$ being modulo 1. Note that (1) is equivalent, for fixed $p$, to a
rotation of the $d$-torus. More general flows on tori are considered e.g. in [4].
For any bounded $\mu\in\mathcal M(R)$, we shall say that $\mu$ enjoys the Riemann-Lebesgue
property and write $\mu\in\mathcal M_{RL}$, if its characteristic function $\hat\mu$ vanishes at infinity, i.e.
if

$$\lim_{t\to\infty}\hat\mu(t)=\lim_{t\to\infty}\int e^{itx}\,\mu(dx)=0.\tag{2}$$

By the Riemann-Lebesgue lemma, this holds in particular when $\mu$ is absolutely
continuous. For other examples, see e.g. [13], p. 20. Putting $\lambda_n(dx)=\mu(dx/n)/\mu R$,
it follows from (2) that

$$\lim_{n\to\infty}\hat\lambda_n(t)=\lim_{n\to\infty}\int e^{itx}\,\lambda_n(dx)=0,\qquad t\ne0.\tag{3}$$

n—©o 


We define A’, to be the class of sequences {A,} < /(R) satisfying (3). 
Letting $\gamma_r$ be the uniform distribution on $[0,r]$, we get for fixed $t\ne0$

$$\hat\gamma_r(t)=\frac1r\int_0^re^{itx}\,dx=\frac1{itr}\,(e^{itr}-1)\to0,\qquad r\to\infty,$$

so {y,}€A’,, and hence Ac A’, by Lemma 2.3. Since e’* is uniformly continuous 
in x for fixed t, it follows by a simple approximation argument that A, A. 
Applying this to the particular choice of {A,} above, we may conclude that 
M,,<— Mp,. This inclusion is in fact strict, as shown in the Appendix. 
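
The closed form $\hat\gamma_r(t)=(e^{itr}-1)/(itr)$ for the characteristic function of the uniform distribution on $[0,r]$ can be checked numerically. The following is only a minimal sketch: the function names `uniform_cf` and `uniform_cf_numeric` and the use of a midpoint rule are illustrative choices, not part of the paper. Since $|e^{itr}-1|\le2$, the modulus is at most $2/(|t|r)$, which gives the decay as $r\to\infty$ for fixed $t\ne0$.

```python
import cmath

def uniform_cf(t, r):
    """Closed form of the characteristic function at t != 0 of Uniform[0, r]."""
    return (cmath.exp(1j * t * r) - 1) / (1j * t * r)

def uniform_cf_numeric(t, r, steps=100000):
    """Midpoint-rule approximation of (1/r) * integral_0^r exp(i t x) dx."""
    h = r / steps
    total = sum(cmath.exp(1j * t * (k + 0.5) * h) for k in range(steps))
    return total * h / r

t = 2.0
for r in (1.0, 10.0, 100.0, 1000.0):
    # The modulus never exceeds the bound 2 / (|t| r).
    print(r, abs(uniform_cf(t, r)), 2.0 / (abs(t) * r))
```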

The following theorem characterizes convergence on $S$. Write $\pi$ for the
natural projection $S\to R^d$, and for $y\in R^d$, define $\pi_y\colon R^d\to R$ by $\pi_yx=xy$ where $xy$
denotes the inner product of $x$ and $y$. Put $R'=R\setminus\{0\}$.


Theorem 4.1. Let ve W(R*) and {1,}€A’, be fixed. Then 

(i) uT,~'—*> some fi for all we W(S) with un! =v iff R'(vn>')€ Mp, for all 

zeZ!; 
(ii) uT,~'\—*+ some ji(A,) for all we N(S) with pn-' =v iff R'(va>")e My for 

all zeZ?. 


Comparing with the corresponding results in §§4 and 6 of [10], it is seen that 
Mp, and A‘, play the same roles here as do the less extensive classes “,, and 
A,, respectively in [10]. 

Note in particular that, if $\mu$ is restricted to $K^d\times\{p\}$ for some fixed $p\in R^d$, then
$\mu T_t^{-1}\xrightarrow{w}\bar\mu\ (\lambda_n)$ is impossible unless $\mu$ is $\mathcal T$-invariant from the beginning. This
example shows that it is not enough, in general, to examine the ergodic components
in order to decide whether a system converges or not. Since the weak
convergence in Theorem 4.1 may be strengthened to set-wise convergence provided
that $\mu\ll H\times\nu$ (cf. the proof of Theorem 2.2), the conclusion of (i) is
equivalent in this case to Rényi stability of $\{T_t^{-1}A\}$ for all $A\in\mathcal S$. Combining the
last two remarks, it is easy to show that stable sequences need not be decomposable
into mixing ones.
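
The contrast between a diffuse and a degenerate velocity distribution can be seen in a small simulation for $d=1$. This is only a sketch under assumptions that are not taken from the paper: the function name `fourier_coefficient`, the standard normal velocity law, the frozen velocity 0.3, the sample size and the seed are all hypothetical choices. The quantity computed is an empirical version of $(\mu T_t^{-1})f$ for $f(q)=e^{2\pi izq}$, which should tend to zero for $z\ne0$ when the limit is uniform in the position coordinate.

```python
import cmath
import math
import random

def fourier_coefficient(z, t, states):
    """Empirical mean of exp(2*pi*i*z*q_t), where q_t = (q + t*p) mod 1."""
    total = 0j
    for q, p in states:
        q_t = (q + t * p) % 1.0  # the flow T_t(q, p) = (q + t p, p) on the circle
        total += cmath.exp(2j * math.pi * z * q_t)
    return total / len(states)

rng = random.Random(0)  # fixed seed, so that runs are reproducible
# Hypothetical initial laws; the position is fixed at 0 in both cases.
smooth = [(0.0, rng.gauss(0.0, 1.0)) for _ in range(20000)]  # diffuse velocity law
frozen = [(0.0, 0.3)] * 20000                                 # velocity law is an atom

for t in (0.0, 1.0, 50.0):
    print(t,
          abs(fourier_coefficient(1, t, smooth)),
          abs(fourier_coefficient(1, t, frozen)))
```

With the diffuse velocity law the modulus drops to the level of the sampling noise once $t$ is moderately large, whereas with the frozen velocity it stays equal to 1 for all $t$, in line with the remark that convergence is impossible for fixed $p$ unless $\mu$ is invariant from the beginning.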

The proof of (i) is based on the following lemma. 


Lemma 4.2. If $\mu\in\mathcal M_{RL}$, then $f\mu\in\mathcal M_{RL}$ for all $f\in L_1(\mu)$.


The latter property may hence be taken as the definition of $\mu\in\mathcal M_{RL}$ for
unbounded $\mu\in\mathcal M(R)$.


Proof. Since any function in $L_1(\mu)$ may be approximated in $L_1(\mu)$ by continuous
functions with bounded support, we may assume that $f$ has these properties.
Fixing $\varepsilon>0$, we choose $a>0$ so large that $f(x)=0$ for $|x|>a$ and moreover
$\|f\|\,\mu\{x;\ |x|>a\}<\varepsilon$. Writing $\tilde f$ for the $2a$-periodic continuation of $f$, we get
$\mu|\tilde f-f|<\varepsilon$, so $f$ may instead be taken to be continuous and periodic. But such $f$
may be approximated uniformly and hence in $L_1(\mu)$ by trigonometric polynomials,
and for the latter the conclusion is trivial.


Proof of Theorem 4.1. (i) Let $\nu$ be such as stated and let $\mu\pi^{-1}=\nu$. Since $\{\mu T_t^{-1}\}$
is clearly relatively weakly compact, it is enough to prove that $(\mu T_t^{-1})F$
converges for all $F\in\mathcal F_c(S)$, and by Theorem 3.1 in [1] we need only consider
functions $F=f\times g$ with $f\in\mathcal F_c(K^d)$ and $g\in\mathcal F_c(R^d)$. Since $f$ can be uniformly
approximated by trigonometric polynomials with unit period in all variables, we
may assume that $f(q)=e^{2\pi izq}$, $q\in K^d$, for some fixed $z\in Z^d$. Writing $\mu_p$ for the
conditional distribution on $K^d$ induced by $\mu$ for fixed $p\in R^d$, we get from (1)
$$(\mu T_t^{-1})F=(\mu T_t^{-1})(f\times g)=\iint e^{2\pi izq}g(p)\,\mu T_t^{-1}(dq,dp)
=\iint e^{2\pi iz(q+tp)}g(p)\,\mu(dq,dp)$$
$$=\int e^{2\pi itzp}g(p)\,\nu(dp)\int e^{2\pi izq}\,\mu_p(dq)
=\int e^{2\pi itzp}g(p)\,\hat\mu_p(2\pi z)\,\nu(dp)=\int e^{2\pi itx}\,(gh_z\nu)\pi_z^{-1}(dx),\tag{4}$$
where $h_z(p)=\hat\mu_p(2\pi z)$. Noting that $R'((gh_z\nu)\pi_z^{-1})\in\mathcal M_{RL}$ by Lemma 4.2 since
$(gh_z\nu)\pi_z^{-1}\ll\nu\pi_z^{-1}$, we thus obtain $(\mu T_t^{-1})F\to(gh_z\nu)\pi_z^{-1}\{0\}$.


Suppose conversely that $\mu T_t^{-1}$ converges weakly whenever $\mu\pi^{-1}=\nu$. Letting
$\mu=\delta_0\times\nu$, it is seen as above that, for any $z\in Z^d$, the limit

$$\lim_{t\to\infty}\int e^{2\pi itx}\,\nu\pi_z^{-1}(dx)=a_z\tag{5}$$

exists. Thus $\nu\pi_z^{-1}\{0\}=a_z$ by Theorem 3.2.3 in [13], and by (5) this entails
$R'(\nu\pi_z^{-1})\in\mathcal M_{RL}$.


(ii) Let {A,}€A\, and let ux-' =v. For f,(q)=e?*'*4 and for g¢F,(R*) we get 
from (3) and (4) by Fubini’s theorem and dominated convergence 


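\[
\begin{aligned}
\int (\mu T_t^{-1})(f_z \times g)\, \lambda_n(dt) &= \int \lambda_n(dt) \int e^{2\pi i t x}\, (g h_z \nu)\pi_z^{-1}(dx) \\
&= \int (g h_z \nu)\pi_z^{-1}(dx) \int e^{2\pi i t x}\, \lambda_n(dt) \to (g h_z \nu)\pi_z^{-1}\{0\},
\end{aligned}
\]
so $\tilde\mu$ exists and is given by
\[
\tilde\mu(f_z \times g) = (g h_z \nu)\pi_z^{-1}\{0\}. \tag{6}
\]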
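Applying (6) to $\mu^2$ and noting that
\[
h_{z_1,z_2}(p_1,p_2) = (\mu^2)^{\wedge}_{p_1,p_2}(2\pi z_1, 2\pi z_2) = \hat\mu_{p_1}(2\pi z_1)\,\hat\mu_{p_2}(2\pi z_2) = h_{z_1}(p_1)\,h_{z_2}(p_2),
\]
it is seen that $(\mu^2)^{\sim}$ exists and is given by
\[
(\mu^2)^{\sim}(f_{z_1} \times f_{z_2} \times g_1 \times g_2) = [(g_1 \times g_2)(h_{z_1} \times h_{z_2})\,\nu^2]\,\pi_{z_1,z_2}^{-1}\{0\}
= (\kappa_1 \times \kappa_2)\,\pi_{1,1}^{-1}\{0\}, \tag{7}
\]
where $\kappa_1$ and $\kappa_2$ denote the complex valued measures
\[
\kappa_j = (g_j h_{z_j} \nu)\pi_{z_j}^{-1}, \qquad j = 1, 2.
\]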

By (6), (7) and Theorem 2.7, 1 T,~!—*> fi(A,) holds iff 
\[
(\kappa_1 \times \kappa_2)\{(x_1, x_2) : x_1 + x_2 = 0\} = (\kappa_1 \times \kappa_2)\{(x_1, x_2) : x_1 = x_2 = 0\} \tag{8}
\]
for all $g_1$, $g_2$, $z_1$ and $z_2$. If $\nu\pi_{z_1}^{-1}$ and $\nu\pi_{z_2}^{-1}$ are diffuse on $R'$, then so are $\kappa_1$ and
$\kappa_2$, and (8) follows by Fubini's theorem. On the other hand, if $\nu\pi_z^{-1}\{a\} > 0$ for
some $z \in Z^d$ and $a \ne 0$, then
\[
[\nu\pi_z^{-1} \times \nu\pi_{-z}^{-1}]\{(x_1, x_2) : x_1 + x_2 = 0\} = \sum_{x \in R} (\nu\pi_z^{-1}\{x\})^2 > (\nu\pi_z^{-1}\{0\})^2,
\]
which violates (8) for $g_1 = g_2 = 1$, $z_1 = -z_2 = z$ and $\mu = \delta_0 \times \nu$. □


An alternative proof of (ii) may be based on the facts that
$\mathcal{T}(q_1, p_1) \times \mathcal{T}(q_2, p_2)$ is $\mathcal{T}$-minimal a.s. $\mu^2$ iff $\nu = \mu\pi^{-1}$ is such as stated, and that the
minimal invariant sets are automatically uniquely ergodic in the present case
(cf. [18], p. 138).

Our next theorem states when the limiting measure in Theorem 4.1 is
invariant under arbitrary rotations. Let $H$ be the normalized Haar measure on
$K^d$.


Theorem 4.3. Let ve W(R*) and {i,}€A', be arbitrary. Then f=H xv for all 
peN (S) with pn~' =v iff vx>'{0}=0 for all zeZ4~ {0}. 


For the corresponding unbounded systems, a similar result is true in one
dimension (Theorem 3.2 in [8]; cf. pp. 64 and 109 in [6]), while the corresponding
statements in higher dimensions seem to require local invariance (Theorem 4.3
in [10]; cf. Lemma 9 on p. 143 in [6], Theorem 6 in [16], and Theorem 3.1 in [8]).


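Proof. Using the above notations, we get for any $g$ and $z$
\[
(H \times \nu)(f_z \times g) = H f_z \cdot \nu g = \delta_{0,z}\, \nu g,
\]
so by (6), $\tilde\mu = H \times \nu$ iff
\[
(g h_z \nu)\pi_z^{-1}\{0\} = \delta_{0,z}\, \nu g. \tag{9}
\]
For $z \ne 0$ the left side equals 0 if $\nu\pi_z^{-1}\{0\} = 0$, and for $z = 0$ we get
\[
(g h_0 \nu)\pi_0^{-1}\{0\} = (g\nu)R^d = \nu g,
\]
so (9) follows. If instead $\nu\pi_z^{-1}\{0\} > 0$ for some $z \ne 0$, we get for $\mu = \delta_0 \times \nu$ and
$g = 1$
\[
(g h_z \nu)\pi_z^{-1}\{0\} = \nu\pi_z^{-1}\{0\} > 0,
\]
which contradicts (9).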
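The condition $\nu\pi_z^{-1}\{0\} = 0$ can be seen at work for a single particle. The following sketch rests on assumptions that are not in the paper: it takes $\mu = \delta_q \times \delta_p$ on $K^2 \times R^2$, replaces the averaging distributions by the uniform distribution on $[0, T]$ (a hypothetical stand-in), and uses an illustrative function name. By the computation leading to (4), the time-averaged Fourier coefficient is $e^{2\pi i zq}$ times the average of $e^{2\pi i t\, zp}$, which stays of modulus one when $zp = 0$ and decays like $1/T$ otherwise.

```python
import cmath
import math


def time_avg_fourier_coeff(z, q, p, T):
    """(1/T) * integral_0^T exp(2*pi*i * z.(q + t*p)) dt, evaluated in closed form.

    z: integer frequency vector, q: initial position on the torus,
    p: velocity, T: length of the averaging window (illustrative choice).
    """
    zq = sum(a * b for a, b in zip(z, q))
    zp = sum(a * b for a, b in zip(z, p))
    phase = cmath.exp(2j * math.pi * zq)
    if zp == 0:
        # Resonant velocity: the coefficient does not depend on t at all.
        return phase
    s = 2.0 * math.pi * zp * T
    return phase * (cmath.exp(1j * s) - 1.0) / (1j * s)
```

For example, with $z = (1, 1)$ the velocity $p = (1, -1)$ lies on the hyperplane $zp = 0$, so the coefficient keeps modulus one and the averaged position distribution cannot be Haar measure, whereas $p = (1, \sqrt{2})$ gives a coefficient that vanishes as $T \to \infty$.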


5. Asymptotically Invariant Conditional Intensities 


For an unbounded system of non-interacting particles, asymptotic invariance in 
a suitable sense of the corresponding conditional intensity [9, 15] is known to 
imply that the system is asymptotically Cox (§5 in [9] and §3 in [10]; cf. §2.7 in 
[6] as well as [7, 8, 15, 16]). The following result for bounded systems is similar, 
except that the family of asymptotic distributions is somewhat larger. For 
simplicity, we restrict ourselves to product spaces $S = U \times V$ such that $U$ and $V$
are both compact metric spaces, and we consider a flow $\mathcal{T}$ on $S$ which is
induced by a flow on $U$ (which will also be denoted by $\mathcal{T}$). The natural
projection $S \to V$ is denoted by $\pi$. We refer to [7] for general background on
random measures. Fix arbitrary $m \in N$ and $p \in (0, 1)$.


Theorem 5.1. Let $\xi_p$ be a $p$-thinning of some simple $m$-point process $\xi$ on $S$, and
write $\eta$ for the conditional intensity of $\xi_p$. Let the random variables $\tau_n$, $n \in N$, be
independent of n with distributions 2,, and let u,¢N(U), veV, be diffuse and such 
that 1, is measurable and a.e. (EEnx~') weakly continuous in v. Suppose that every 
distributional limit $\tilde\eta$ of $\eta T_{\tau_n}^{-1}$ satisfies
\[
\tilde\eta = \int (\mu_v \times \delta_v)\, \zeta(dv) \tag{1}
\]
for some random measure $\zeta$ on $V$. Then
oT, *—4+8(,), (2) 


where $\tilde\xi$ is such that $\tilde\xi\pi^{-1} \overset{d}{=} \xi\pi^{-1}$ and, given $\tilde\xi\pi^{-1}$, the $U$-positions of its atoms
are a.s. independent and distributed according to $\{\mu_v\}$.


For the applications we have in mind, the most important cases would be 
those of degenerate A, and of {A,}€A,,. In the latter case, (1) may sometimes be 
deduced from the fact that 4 is automatically stationary (cf. §6 in [10]). The 
connection with Cox processes is as follows. Suppose for simplicity that
$U = [0, 1]$ and that $\mu_v$ equals Lebesgue measure $\lambda$ on $U$ for all $v$. Then every Cox
process on $U \times V$ which is directed by some random measure of the form $\lambda \times \zeta$
has the conditional independence property of $\tilde\xi$ in the theorem. Conversely,
every point process on $R_+ \times V$ whose restrictions to $[0, t] \times V$ have this property
with respect to Lebesgue measure for all $t > 0$ must be a Cox process directed by
some $\lambda \times \zeta$. (This follows from Corollary 8.5 in [7].) We finally point out that $\xi$
must be thinned before we form the conditional intensity, since $\eta$ would
otherwise coincide with $\xi$, and (1) would be impossible.
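For orientation, the effect of thinning on the total number of points can be computed exactly. The sketch below assumes the usual convention that a $p$-thinning retains each of the $m$ atoms independently with probability $p$; the function names are illustrative and not from the paper. Under that assumption the retained count is binomial, its mean is $pm$ (in line with the identity $pE\xi S = pm$ used at the start of the proof), and the thinned process is empty with probability $(1-p)^m > 0$.

```python
from math import comb


def thinned_count_pmf(m: int, p: float) -> list:
    """Distribution of the number of retained points when each of m points
    is kept independently with probability p (assumed thinning convention)."""
    return [comb(m, k) * p**k * (1.0 - p) ** (m - k) for k in range(m + 1)]


def mean_count(pmf: list) -> float:
    """Expected number of retained points under the given distribution."""
    return sum(k * w for k, w in enumerate(pmf))
```

With `m = 5` and `p = 0.3`, for instance, `mean_count(thinned_count_pmf(5, 0.3))` equals `1.5` up to rounding, and the first entry of the list is `0.7 ** 5`.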


Proof. First note that
\[
E\eta T_{\tau_n}^{-1} S = E\eta S = pE\xi S = pm < \infty,
\]
and hence that $\eta T_{\tau_n}^{-1}$ is relatively compact in distribution (cf. Lemma 4.5 in
[7]). Since the right-hand side of (1) is diffuse, and since the atom sizes of
arbitrary measures on $S$ are $\mathcal{T}$-invariant, it follows that $\eta$ is a.s. diffuse. Since
$\xi S = m$, it is further seen that


P[¢,B=0|B°C,J2P{¢,S=0}=p", BeA@\(S), 


so nSSp~-"<oo by Eq. (3.2) in [9]. 

It will be convenient to assume that the atoms of $\xi$ have a.s. distinct
$V$-positions. If they have not, they may be marked by $\mathcal{T}$-invariant, independent
and uniformly distributed random variables in $[0, 1]$. By [9], $\eta$ will then be
replaced by $\eta \times \lambda$, where $\lambda$ denotes Lebesgue measure on $[0, 1]$, so (1) will remain
valid with $\zeta$ replaced by $\zeta \times \lambda$.
Since $\eta$ is diffuse, it follows by §§3-4 in [9] that $\eta T_t^{-1}$ is the conditional
intensity of $\xi_p T_t^{-1}$, and that these two quantities are related by the integral
equation


CAfx g)=EE, (fe TEE, 7, 'Y,)=En(foT) até, 7, *), (3) 


where (¢,,7,~'), is defined as in [7], p. 71. Combining (3) with (1), and noting 
that »x~' is J -invariant, we get 


\[ pE\xi\pi^{-1} = E\xi_p\pi^{-1} = E\eta\pi^{-1} = \int (\mu_v \times \delta_v)\pi^{-1}\, E\zeta(dv) = E\zeta. \tag{4} \]


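Letting $h_i \in \mathcal{F}(U)$ and $k_i \in \mathcal{F}(V)$, $i \in N$, be such that the integrals of all $h_i \times k_i$
determine any bounded measure on $S$, and defining
\[
l_i(v) = \mu_v h_i \cdot k_i(v), \quad v \in V; \qquad f_i = h_i \times k_i - 1 \times l_i, \quad i \in N, \tag{5}
\]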
we get by (4) 


flu, x 5,)D,,EC(dv)=ECD,, SECD,, 4) SECD,,,, =pEEn~'D,, ,=0, 
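where $D_f$ denotes the discontinuity set of $f$, and therefore
\[
\int (\mu_v \times \delta_v) D_{f_i}\, \zeta(dv) = 0 \quad \text{a.s.}, \qquad i \in N.
\]
By (5) and Lemma 4.4 in [7], we thus obtain
\[
\eta(f_i \circ T_{\tau_n}) \xrightarrow{d} \int (\mu_v \times \delta_v) f_i\, \zeta(dv) = 0,
\]
and since $\eta$ is bounded, this extends to $L_1$-convergence, and we get
\[
E[\,|\eta(f_i \circ T_{\tau_n})| \mid \tau_n\,] \to 0 \quad \text{in } L_1, \qquad i \in N. \tag{6}
\]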


Let $N' \subset N$ be such that the convergence in (6) is almost sure along $N'$, and let
$\{t_n\}$ be any fixed sequence such that this convergence takes place for $\tau_n = t_n$.
Then we get by (3) for any bounded $g$


\[ |C_{t_n}(f_i \times g)| \le \|g\|\, E|\eta(f_i \circ T_{t_n})| \to 0, \qquad i \in N \quad (n \in N'). \tag{7} \]


Since $\{\xi_p T_{t_n}^{-1}\}$ is relatively compact in distribution, we may choose a further
subsequence $N'' \subset N'$ such that $\xi_p T_{t_n}^{-1} \xrightarrow{d}$ some $\chi$ ($n \in N''$), and putting


C(fxg)=ExfEg(xz;), (8) 
it follows from Theorem 10.5 in [7] that, for any bounded and continuous g, 


C,,(*xg)—*+> C(-xg) = (neN”). (9) 
Now 


C(D,,x 1)=ExD,,=Exn-' D, =Eé,n~-'D, <pEEnx~'D,, ,=0, 
so by (9) and A7.3 in [7], 


C,,6,* 8) C(f,xg), ieN (neN”). 


Comparing this with (7), it is seen that the right-hand side must be zero, and
writing $C_g = C(\cdot \times g)$, we thus obtain by (5)
\[
\begin{aligned}
C_g(h_i \times k_i) &= C_g(1 \times l_i) = C_g\pi^{-1} l_i = \int \mu_v h_i\, k_i(v)\, C_g\pi^{-1}(dv) \\
&= \int (\mu_v \times \delta_v)(h_i \times k_i)\, C_g\pi^{-1}(dv).
\end{aligned}
\]
By the choice of $\{h_i \times k_i\}$, it follows that
\[
C_g = \int (\mu_v \times \delta_v)\, C_g\pi^{-1}(dv), \tag{10}
\]
and this extends immediately to arbitrary measurable $g \ge 0$.

We now consider arbitrary measurable sets 1c U, JcJ’cV, Mc.M(S), and 
put B=z~'J’. On the event {yB=1} we denote by o=(«, B)eU x V the position 
of the unique y-atom in B. Writing A = {y: »B=0, B‘ we M}, and using notations 
and elementary facts from the theory of Palm distributions (cf. §10 in [7]), we 
get by (8) and (10) 


P {ael, BeJ, ~B=1, B’yeM} 
=E[y( x J);~B=1,B°yeM]= | P{y,B=1, Bx,eM}Ex(ds) 
IxJ 


= J P{(x,—6)B=0, B'(y,—6 eM} Ex(ds)= | P{z,—5,¢A} Ex(ds) 
IxJ 


IxJ 


=C(IxJ x A)=fu,16,JC,x-'(dv)=[u,1C,n-'(dv)=[u,1C(U xdvx A), 
J J 


and in particular 
P {BeJ,7~B=1, B yeM}=C(UxJ x A). 
By the definition of conditional probabilities, we thus obtain 
P[ae-|B,~—6,]=P[ae-|B, 7B, BoyJ=u, as. on {yB=1}. 
Since $J'$ was arbitrary, and since $\chi\pi^{-1}$ is distributed as $\xi_p\pi^{-1}$ and hence is a.s.
simple, it follows that $\chi \overset{d}{=} \tilde\xi_p$, where $\tilde\xi_p$ is formed from $\xi_p$ just as $\tilde\xi$ from $\xi$. Thus
$\xi_p T_{t_n}^{-1} \xrightarrow{d} \tilde\xi_p$ along $N''$, and since the limit is independent of the choice of
subsequence, we may conclude from Theorem 2.3 in [1] that


i | (11) 


Now it is obvious that $\tilde\xi_p$ is distributed as a $p$-thinning of $\tilde\xi$. (This can easily
be verified computationally by means of Laplace transforms, cf. §1 in [7].) By
Exercise 4.5 in [7], we may thus conclude from (11) that


cF*-4ot wen). 


Since the limit is independent of the choice of sequence {t,}, it follows by 
dominated convergence that 


ET, '—4+&(2,) (neN’), 


and this remains true along $N$, since the limit does not depend on $N'$ either (cf.
A1.2 in [7]). □
The present argument simplifies the proof of Lemma 4.3 in [8] and yields an
extension of that result to more general decompositions of $E\xi$. Proceeding as
above, we may further strengthen the conclusions of Lemma 3.4 in [10] to mean 
convergence. The latter remark is implicit in the introduction of §6 in [10]. 


Appendix 


According to the classical Lebesgue decomposition theorem, every measure $\mu$ on
$R$ may be written uniquely in the form $\mu = \mu_c + \mu_s + \mu_a$, where $\mu_c$ is absolutely
continuous, $\mu_s$ is diffuse but singular, and $\mu_a$ is purely atomic. It may be
interesting to notice that the local invariance and Riemann-Lebesgue properties
induce a further decomposition of $\mu_s$ into three components. To state our result,
write $\mathcal{M}_{AC}$ and $\mathcal{M}_D$ for the classes of absolutely continuous and diffuse measures,
and let $\mathcal{M}_{LI}$ and $\mathcal{M}_{RL}$ be the extensions to possibly unbounded measures of the
classes of locally invariant and Riemann-Lebesgue measures. Put $\mathcal{M}(R) = \mathcal{M}$. By
$\mu \perp \nu$ we mean that the measures $\mu$ and $\nu$ have disjoint supporting sets, and by
$\mu \perp \mathcal{M}'$ that $\mu \perp \nu$ for all $\nu \in \mathcal{M}'$. A decomposition (rule) $\mu = \mu' + \mu''$ is said to be
persistent under absolute continuity, if $(f\mu)' = f\mu'$ for all $\mu \in \mathcal{M}$ and $f \in L_1(\mu)$.
Persistence under addition and monotone convergence is defined analogously.
Put
\[
\mathcal{M}_0 = \emptyset, \quad \mathcal{M}_1 = \mathcal{M}_{AC}, \quad \mathcal{M}_2 = \mathcal{M}_{LI}, \quad \mathcal{M}_3 = \mathcal{M}_{RL}, \quad \mathcal{M}_4 = \mathcal{M}_D, \quad \mathcal{M}_5 = \mathcal{M}.
\]


Theorem A.1. Every measure $\mu \in \mathcal{M}$ may be written uniquely in the form
\[
\mu = \mu_1 + \dots + \mu_5,
\]
where $\mu_i \in \mathcal{M}_i$ while $\mu_i \perp \mathcal{M}_{i-1}$, $i = 1, \dots, 5$. This decomposition is persistent under
absolute continuity, addition, and monotone convergence. All five components may
be non-zero.


The first two assertions follow by straightforward elementary reasoning, once
it is realized that the classes $\mathcal{M}_1, \dots, \mathcal{M}_5$ are ordered by inclusion and are closed
under absolute continuity, addition, and monotone convergence (cf. Lemma 2.2
in [10] and Lemma 4.2 above). The decomposition itself is even a direct
consequence of §1, No. 5 in [2]. (In fact, the classes $\mathcal{M}_i$ are bands in the sense of
[2].) A measure in $\mathcal{M}_{LI} \setminus \mathcal{M}_{AC}$ was constructed in Lemma 2.1 of [10], and
examples of elements in $\mathcal{M}_D \setminus \mathcal{M}_{RL}$ are in [13], p. 20. It remains to produce a
measure in $\mathcal{M}_{RL} \setminus \mathcal{M}_{LI}$. Such measures are rather easy to construct, but the
Riemann-Lebesgue property may often be hard to verify. We prefer to present a
rather complex example where the proof is easy. For the sake of brevity, we
omit some details.


Example. For neN, put 


g,=n*, a,=(al)®, v,=n-5@l)®, T,=x/2v,, 


and define
\[
f_n(t) = T_n^{-1} \cos(\alpha_n + \omega_n t + \rho_n \sin \nu_n t)
\]
for $t \in [0, T_n]$ and otherwise $f_n(t) = 0$, where $\alpha_n$ is any number such that $\int f_n(t)\,dt$
=0. Next define go =1jo,,, and, recursively, 


&5=S4-4 * ¥ cu f(t—kT,, neN, 
k=1 


where in each step the $c_{nk}$ are chosen to be the largest real numbers compatible
with $g_n \ge 0$. Then $g_n$ is the density of some probability measure $\mu_n$. By
Lemma A.2 below,


lay — By 11S 2 Cnt -" \fn(A)| S2sup Iful4)| =O(@, *°)=O(n-**) (1) 
k 


for large $n$, so the $\hat\mu_n$ converge uniformly. Moreover, $\hat f_n(\lambda) \to 0$ as $\lambda \to 0$ for fixed
$n$, so it follows by dominated convergence and the continuity theorem for
characteristic functions that $\mu_n \xrightarrow{w}$ some $\mu$. By the Riemann-Lebesgue lemma
and dominated convergence it is further seen that $\mu \in \mathcal{M}_{RL}$.

In order to prove that $\mu \notin \mathcal{M}_{LI}$, it is enough to show that


limsup ||um_* 


n—0o 


* Uyo— pm, * * Uo * 5, || >0, (2) 


where m,x=tx and t,=q,/z. Since the distribution functions of y,, u,,,,,... and 

hence those of yu, and yp (since we.Mp) coincide at the points kT,,,,, keN, and 

since t,, T,, 0, we get asymptotically the same total variation in (2) if we replace 

Lt by y,. If u is instead replaced by y,,_,, we get asymptotically zero variation, so 

the main contribution comes from y,,—,_,, and since }\c,,— 1, we may even 
k 


replace in (2) by the signed measure p, with density f,. Now @,/v, 9, =n— ©, 
so the density of p,m,' behaves like (2v,/m,)sinzt on [0,@,/2v,]—R,, apart 
from some frequency modulation which tends to zero as n— oo. Thus the total 
variation in (2) is asymptotically 


1 2 
fsinn(t+u)dt—fsinn(t+u)dt|du 
0 1 


4} g 12 8 
=—Jf|cosnu|du=— {| cosnudu=—5>0. O 
To TN oO Tt 


We conclude by establishing the following estimate which was used in (1). 
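Lemma A.2. For any $\alpha, \nu, \omega, \rho > 0$, define
\[
f(t) = \nu \cos(\alpha + \omega t + \rho \sin \nu t)
\]
for $t \in [0, \pi/2\nu]$ and otherwise $f(t) = 0$. Then
\[
\hat f(\lambda) = \int e^{i\lambda t} f(t)\,dt = O(\rho^{-1/2}), \qquad \rho \to \infty,
\]
uniformly in $\alpha$, $\omega$, $\nu$ and $\lambda$.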
This is related to some results for Bessel functions in [19], pp. 231, 257ff. 


Proof. A change of variable reduces the discussion to the case $\nu = 1$, and then
\[
\begin{aligned}
\operatorname{Re} \hat f(\lambda) &= \int_0^{\pi/2} \cos \lambda t\, \cos(\alpha + \omega t + \rho \sin t)\,dt \\
&= \tfrac12 \int_0^{\pi/2} \cos(\alpha + (\omega + \lambda)t + \rho \sin t)\,dt
 + \tfrac12 \int_0^{\pi/2} \cos(\alpha + (\omega - \lambda)t + \rho \sin t)\,dt,
\end{aligned}
\]
so it suffices to estimate
\[
\int_0^{\pi/2} \cos(\alpha + \omega t + \rho \sin t)\,dt. \tag{3}
\]
Since the function $\alpha + \omega t + \rho \sin t$ is concave, the contributions from the
successive intervals where the integrand is positive or negative are monotonic in
absolute value on each of the two sub-intervals separated by the solution of
$\omega + \rho \cos t = 0$ (if a solution exists and otherwise on the whole domain). Thus (3) is
bounded in absolute value by the total length of the two longest of these
intervals, and so we get by elementary calculations the uniform bound
$2(3\pi/\rho)^{1/2}$, valid for sufficiently large $\rho$. □
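The size of the integral (3) can be checked numerically for a few parameter values. The sketch below is only a sanity check under stated assumptions: the constant in the bound is read from the source as $2(3\pi/\rho)^{1/2}$, only a handful of non-negative values of $\omega$ are sampled, and the function names and the use of composite Simpson quadrature are illustrative choices, not part of the paper.

```python
import math


def phase_integral(alpha: float, omega: float, rho: float, n: int = 20000) -> float:
    """Composite Simpson approximation of
    integral_0^{pi/2} cos(alpha + omega*t + rho*sin(t)) dt  (n must be even)."""
    a, b = 0.0, math.pi / 2.0
    h = (b - a) / n

    def g(t: float) -> float:
        return math.cos(alpha + omega * t + rho * math.sin(t))

    total = g(a) + g(b)
    for k in range(1, n):
        # Simpson weights: 4 at odd nodes, 2 at even interior nodes.
        total += (4.0 if k % 2 else 2.0) * g(a + k * h)
    return total * h / 3.0


def claimed_bound(rho: float) -> float:
    """Bound 2*(3*pi/rho)**0.5 as read from the (garbled) source text."""
    return 2.0 * math.sqrt(3.0 * math.pi / rho)
```

For instance, comparing `abs(phase_integral(0.0, 0.0, 200.0))` with `claimed_bound(200.0)` shows the integral well below the bound, and repeating the comparison for larger `rho` illustrates the decay of order $\rho^{-1/2}$.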